repoguide

Version: v1.2.0 (latest) | Published: Feb 26, 2026 | License: MIT | Imports: 17 | Imported by: 0

Repository

github.com/phobologic/repoguide

Tree-sitter repository map in TOON format for LLM consumption.

What it does

repoguide parses a codebase with tree-sitter, extracts symbols (classes, functions, methods, imports), builds a file-to-file dependency graph, and ranks files by PageRank. The output is a compact TOON-formatted map designed to fit in an LLM context window.

The goal: give an LLM agent a high-level map of a codebase so it can explore more effectively — knowing which files matter most, what symbols they define, and how they depend on each other.

Written in Go for fast parallel parsing across CPU cores.

Installation

Requires Go 1.24+ and a C compiler (for tree-sitter CGo bindings).

go install github.com/phobologic/repoguide@latest

Or build from source:

git clone https://github.com/phobologic/repoguide.git
cd repoguide
go build -o repoguide .

Set the version at build time:

go build -ldflags "-X main.version=1.0.0" -o repoguide .
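
The -X linker flag overwrites a package-level string variable, so the command above implies a version variable in package main. A minimal sketch of that pattern follows; the "dev" default and the versionString helper are assumptions for illustration, not taken from the repoguide source:

```go
package main

import "fmt"

// version is replaced at link time by:
//   go build -ldflags "-X main.version=1.0.0"
// The "dev" default is an assumption for builds that omit the flag.
var version = "dev"

// versionString is a hypothetical helper that formats the --version output.
func versionString() string {
	return "repoguide " + version
}

func main() {
	fmt.Println(versionString())
}
```

Note that -X only takes effect on a string variable that is uninitialized or initialized to a constant string expression; it cannot set a const.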

Usage

repoguide [ROOT] [OPTIONS]

Option	Description
`ROOT`	Repository root directory (default: `.`)
`--max-files`, `-n`	Limit output to top N files by PageRank (min: 1)
`--langs`, `-l`	Comma-separated languages to include (e.g., `python,go`)
`--cache`	Cache output to file; reuses if newer than all source files (add to `.gitignore`)
`--max-file-size`	Skip files larger than this many bytes (default: 1MB)
`--symbol`	Filter output to symbols matching this substring (case-insensitive)
`--file`	Filter output to files matching this substring (case-insensitive)
`--with-tests`	Include test files in output (excluded by default)
`--raw`	Output raw TOON without agent context header
`--version`, `-V`	Show version and exit

Example

By default, output includes a preamble header that explains the format for AI agent consumption. Use --raw to strip the header for bare TOON output.

$ repoguide /path/to/myproject -n 3
# Repository Map

This is a repository map generated by repoguide. It shows the structure,
key symbols, and dependencies of the codebase in TOON format.
...

---
repo: myproject
root: myproject
files[3]{path,language,rank}:
  myproject/models.py,python,0.2755
  myproject/languages.py,python,0.1183
  myproject/discovery.py,python,0.0608
symbols[17]{file,name,kind,line,signature}:
  myproject/models.py,TagKind,class,10,TagKind(enum.Enum)
  myproject/models.py,SymbolKind,class,17,SymbolKind(enum.Enum)
  myproject/models.py,Tag,class,27,Tag
  myproject/models.py,FileInfo,class,39,FileInfo
  ...
dependencies[1]{source,target,symbols}:
  myproject/discovery.py,myproject/languages.py,language_for_extension

Focused queries

Use --symbol and --file to get a targeted view instead of the full map. These are useful when asking Claude about a specific function or subsystem.

repoguide --symbol BuildGraph        # show BuildGraph: definition, callers, callees, import sites
repoguide --file internal/auth       # show all symbols and deps for auth package
repoguide --symbol Handle --file srv # combine: Handle symbol scoped to srv files

Both flags do case-insensitive substring matching and can be combined (AND semantics). When either flag is set, the cache is not read, but the full unfiltered output is still written to the cache on the same run.
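
The matching rule described above (case-insensitive substring, AND across both flags) can be sketched as follows. The function names are hypothetical, and treating an empty query as "flag not set" is an assumption:

```go
package main

import (
	"fmt"
	"strings"
)

// containsFold reports whether s contains substr, ignoring case.
// An empty substr matches everything, which models an unset flag.
func containsFold(s, substr string) bool {
	return strings.Contains(strings.ToLower(s), strings.ToLower(substr))
}

// matches applies AND semantics: both the symbol filter and the
// file filter must match for an entry to be kept.
func matches(symbol, file, symbolQuery, fileQuery string) bool {
	return containsFold(symbol, symbolQuery) && containsFold(file, fileQuery)
}

func main() {
	fmt.Println(matches("HandleRequest", "internal/srv/http.go", "handle", "srv")) // true
	fmt.Println(matches("HandleRequest", "internal/auth/jwt.go", "handle", "srv")) // false
}
```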

The --symbol output includes a callsites table with every call occurrence and every file-level import site, each with exact file and line number. An agent can pass those line numbers to its file-reading tool, for example Read(offset=N), to jump straight to the relevant code without scanning the file.

Subcommands

`repoguide init`

repoguide init [--dry-run] [path-to-CLAUDE.md]

Writes a repoguide usage section to a CLAUDE.md file, creating it if it doesn't exist. The section instructs Claude Code to call repoguide at the start of tasks and explains how to read the output.

repoguide init                     # write to ./CLAUDE.md
repoguide init path/to/CLAUDE.md   # explicit path
repoguide init --dry-run           # print the generated section, no file written
repoguide init --dry-run CLAUDE.md # print what the full file would look like

The command reports what it did: created, updated, or already up to date. Safe to run repeatedly — skips the write when nothing has changed.

The block is wrapped in HTML sentinel comments so subsequent runs replace only that section, leaving surrounding content untouched:

<!-- repoguide:start -->
...generated content...
<!-- repoguide:end -->
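
A sentinel-based update like this can be sketched in a few lines. This is an illustration of the idea, not repoguide's actual code; upsertSection and its status strings are hypothetical names, and the append-with-blank-line behavior is an assumption:

```go
package main

import (
	"fmt"
	"strings"
)

const (
	startMarker = "<!-- repoguide:start -->"
	endMarker   = "<!-- repoguide:end -->"
)

// upsertSection returns doc with the sentinel-wrapped block set to body,
// plus a status: "created", "updated", or "unchanged".
func upsertSection(doc, body string) (string, string) {
	block := startMarker + "\n" + body + "\n" + endMarker
	start := strings.Index(doc, startMarker)
	end := strings.Index(doc, endMarker)
	if start < 0 || end < start {
		// No well-formed block yet: append one after a blank line.
		if doc != "" && !strings.HasSuffix(doc, "\n") {
			doc += "\n"
		}
		if doc != "" {
			doc += "\n"
		}
		return doc + block + "\n", "created"
	}
	// Replace only the span between the sentinels, inclusive.
	updated := doc[:start] + block + doc[end+len(endMarker):]
	if updated == doc {
		return doc, "unchanged"
	}
	return updated, "updated"
}

func main() {
	doc, status := upsertSection("# Project notes\n", "Run repoguide first.")
	fmt.Println(status)
	_, status = upsertSection(doc, "Run repoguide first.")
	fmt.Println(status)
}
```

Comparing the rewritten document against the original is what makes repeated runs a no-op when nothing has changed.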

Claude Code integration

The primary use case is running repoguide as a Claude Code hook so every subagent automatically gets a repo map injected into its context.

Add this to .claude/settings.json:

{
  "hooks": {
    "SubagentStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "repoguide \"$CLAUDE_PROJECT_DIR\" --cache \"$CLAUDE_PROJECT_DIR/.cache/repoguide.toon\""
          }
        ]
      }
    ]
  }
}

The SubagentStart hook fires when any subagent launches. repoguide's stdout is injected into the subagent's context, giving it an instant overview of the codebase. The default output includes a preamble header that explains the format, so the agent understands what it's looking at without any additional configuration.

--cache avoids re-parsing on every agent launch — the cache file is reused as long as no source files have changed. Add .cache/ to your .gitignore.
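
The freshness rule (reuse the cache only if it is newer than every source file) might look like the sketch below. cacheFresh and demo are hypothetical names, and comparing modification times via os.Stat is an assumption about how the check is done:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// cacheFresh reports whether the cache file is strictly newer than
// every source file. A missing cache counts as stale.
func cacheFresh(cachePath string, sources []string) (bool, error) {
	cacheInfo, err := os.Stat(cachePath)
	if err != nil {
		if os.IsNotExist(err) {
			return false, nil
		}
		return false, err
	}
	for _, src := range sources {
		srcInfo, err := os.Stat(src)
		if err != nil {
			return false, err
		}
		if !cacheInfo.ModTime().After(srcInfo.ModTime()) {
			return false, nil
		}
	}
	return true, nil
}

// demo checks freshness before and after a source file is touched.
func demo() (before, after bool, err error) {
	dir, err := os.MkdirTemp("", "repoguide-cache-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "main.go")
	cache := filepath.Join(dir, "repoguide.toon")
	for _, p := range []string{src, cache} {
		if err := os.WriteFile(p, []byte("demo\n"), 0o644); err != nil {
			return false, false, err
		}
	}
	now := time.Now()
	// Source older than the cache: the cache can be reused.
	if err := os.Chtimes(src, now.Add(-time.Hour), now.Add(-time.Hour)); err != nil {
		return false, false, err
	}
	if before, err = cacheFresh(cache, []string{src}); err != nil {
		return false, false, err
	}
	// Source modified after the cache was written: the cache is stale.
	if err := os.Chtimes(src, now.Add(time.Hour), now.Add(time.Hour)); err != nil {
		return false, false, err
	}
	after, err = cacheFresh(cache, []string{src})
	return before, after, err
}

func main() {
	before, after, err := demo()
	fmt.Println(before, after, err)
}
```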

TOON format

The output uses TOON (Token-Oriented Object Notation), a compact format designed for LLM consumption:

Scalar fields — key: value
Tabular arrays — name[count]{col1,col2,...}: followed by indented CSV rows
Quoting — values containing special characters are double-quoted; numbers and plain strings are bare
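
As an illustration of the tabular form, here is a small encoder sketch. It is not the internal toon package; encodeTable and quoteIfNeeded are hypothetical names, and both the set of characters treated as special and the use of Go-style escaping via strconv.Quote are assumptions:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// quoteIfNeeded double-quotes values that would be ambiguous in a
// comma-separated row. The exact set of special characters is assumed.
func quoteIfNeeded(v string) string {
	if v == "" || strings.ContainsAny(v, ",:\"\n") || strings.TrimSpace(v) != v {
		return strconv.Quote(v)
	}
	return v
}

// encodeTable renders name[count]{col1,col2,...}: followed by one
// indented comma-separated row per record.
func encodeTable(name string, cols []string, rows [][]string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "%s[%d]{%s}:\n", name, len(rows), strings.Join(cols, ","))
	for _, row := range rows {
		cells := make([]string, len(row))
		for i, v := range row {
			cells[i] = quoteIfNeeded(v)
		}
		b.WriteString("  " + strings.Join(cells, ",") + "\n")
	}
	return b.String()
}

func main() {
	fmt.Printf("repo: %s\n", quoteIfNeeded("myproject"))
	fmt.Print(encodeTable("files", []string{"path", "language", "rank"}, [][]string{
		{"myproject/models.py", "python", "0.2755"},
		{"myproject/languages.py", "python", "0.1183"},
	}))
}
```

Declaring the column names once in the header is what keeps the format compact: each row carries only values.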

How it works

1. Discover files: uses git ls-files when available, falls back to .gitignore-based filtering
2. Parse with tree-sitter: extracts classes, functions, methods, and imports from each file
3. Build dependency graph: creates file-to-file edges based on shared symbols (imports that resolve to definitions in other files)
4. Rank with PageRank: scores files by importance in the dependency graph
5. Select top N: when --max-files is set, keeps only the highest-ranked files
6. Encode to TOON: serializes the repo map into the compact output format
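
The ranking and selection steps above can be sketched with a textbook PageRank power iteration. This is not the internal graph package: the damping factor, iteration count, dangling-node handling, and function names are all assumptions, and every edge target is assumed to appear in the file list:

```go
package main

import (
	"fmt"
	"sort"
)

// pageRank runs power iteration over a file dependency graph.
// edges[src] lists the files src imports from, so rank flows from
// each importing file to the files that define what it uses.
func pageRank(files []string, edges map[string][]string, damping float64, iters int) map[string]float64 {
	n := float64(len(files))
	rank := make(map[string]float64, len(files))
	for _, f := range files {
		rank[f] = 1 / n
	}
	for i := 0; i < iters; i++ {
		// Files with no outgoing edges spread their rank evenly.
		dangling := 0.0
		for _, f := range files {
			if len(edges[f]) == 0 {
				dangling += rank[f]
			}
		}
		base := (1-damping)/n + damping*dangling/n
		next := make(map[string]float64, len(files))
		for _, f := range files {
			next[f] = base
		}
		for _, src := range files {
			out := edges[src]
			for _, dst := range out {
				next[dst] += damping * rank[src] / float64(len(out))
			}
		}
		rank = next
	}
	return rank
}

// topN returns the n highest-ranked files, breaking ties by path.
func topN(rank map[string]float64, n int) []string {
	files := make([]string, 0, len(rank))
	for f := range rank {
		files = append(files, f)
	}
	sort.Slice(files, func(i, j int) bool {
		if rank[files[i]] != rank[files[j]] {
			return rank[files[i]] > rank[files[j]]
		}
		return files[i] < files[j]
	})
	if n < len(files) {
		files = files[:n]
	}
	return files
}

func main() {
	files := []string{"models.py", "languages.py", "discovery.py"}
	edges := map[string][]string{
		"discovery.py": {"languages.py", "models.py"},
		"languages.py": {"models.py"},
	}
	rank := pageRank(files, edges, 0.85, 50)
	fmt.Println(topN(rank, 2)) // [models.py languages.py]
}
```

Files that many others import from accumulate rank, which is why a shared definitions file tends to rise to the top of the map.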

Parsing runs concurrently across all available CPU cores.
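
One common way to get this kind of per-core parallelism in Go is a fixed worker pool. The sketch below is a generic pattern rather than repoguide's implementation; parseAll is a hypothetical name and the parse callback stands in for the real tree-sitter step:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parseAll applies parse to every path using one worker per CPU core.
// Results come back in the same order as paths.
func parseAll(paths []string, parse func(path string) int) []int {
	results := make([]int, len(paths))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < runtime.NumCPU(); w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one worker, so no lock is needed.
				results[i] = parse(paths[i])
			}
		}()
	}
	for i := range paths {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	// Stand-in parser: returns the path length instead of real tags.
	counts := parseAll([]string{"a.go", "bb.go", "ccc.go"}, func(p string) int {
		return len(p)
	})
	fmt.Println(counts) // [4 5 6]
}
```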

Supported languages

Python, Go, Ruby. Extensible by adding a tree-sitter grammar and a .scm query file to internal/lang/queries/.

Development

make build    # build binary
make test     # run tests
make lint     # run golangci-lint
make fmt      # format with goimports
make cover    # generate coverage report

License

MIT

Documentation

Overview

repoguide generates a tree-sitter repository map in TOON format.

Directories

Path	Synopsis
internal/discover	Package discover finds parseable source files in a repository.
internal/graph	Package graph builds a dependency graph and computes PageRank.
internal/lang	Package lang provides a language registry mapping file extensions to tree-sitter languages and their embedded query files.
internal/model	Package model defines core data structures for repoguide.
internal/parse	Package parse extracts tags from source files using tree-sitter.
internal/ranking	Package ranking implements token-budget-aware file selection.
internal/toon	Package toon implements TOON (Token-Oriented Object Notation) encoding.
