leda
Leda is a linked-edge dependency analyzer that provides context isolation for LLM tools. Given a codebase and a natural-language prompt, it builds a directed dependency graph from source files, seeds entry points from the prompt, and traverses the graph to return only the files the LLM actually needs.
By deterministically tracing dependency paths rather than relying on vector similarity, leda avoids the irrelevant results common in typical RAG setups, delivering a >70% reduction in token usage.
Install
go install github.com/jgabor/leda/cmd/leda@latest
CLI
leda <command> [options]
Commands:
  build      Build and serialize a dependency graph
  query      Query the graph with a natural language prompt
  stats      Print graph statistics
  extract    Extract structured metadata from a source file
  version    Print the leda version
Build a graph
leda build --root ./myproject --output .leda --lang go,ts
leda build --root ./myproject --dry-run # preview files without writing
leda build --root ./myproject --format json # machine-readable output
leda build --root ./myproject --exclude '.openchamber,.worktrees' # skip worktree-style sub-trees that .gitignore doesn't cover
--exclude accepts a comma-separated list of glob patterns matched against
both the relative path and the basename of each entry. Use it for any
directory that contains duplicate copies of the project (vendored worktrees,
agent-experiment outputs, dataset mirrors) when those dirs aren't already
covered by .gitignore or leda's built-in DefaultExclude (.git,
node_modules, vendor, .next, __pycache__, .tox, dist, build).
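The matching rule above can be pictured with a short Go sketch. This is an illustration only, not leda's actual matcher: the helper name `excluded` is hypothetical, and it assumes standard `path.Match` glob semantics on slash-separated relative paths.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// excluded reports whether relPath (slash-separated, relative to the project
// root) matches any pattern, either as a whole relative path or by basename.
// Hypothetical helper for illustration; leda's real matcher may differ.
func excluded(relPath string, patterns []string) bool {
	base := path.Base(relPath)
	for _, p := range patterns {
		// path.Match only errors on a malformed pattern; treat that as no match.
		if ok, err := path.Match(p, relPath); err == nil && ok {
			return true
		}
		if ok, err := path.Match(p, base); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	// Same value as the --exclude example above.
	patterns := strings.Split(".openchamber,.worktrees", ",")
	for _, p := range []string{".worktrees", "tools/.openchamber", "internal/parser"} {
		fmt.Println(p, excluded(p, patterns))
	}
}
```

Under these assumptions, `tools/.openchamber` is excluded through its basename even though the pattern contains no directory component, while `internal/parser` is kept.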
Query with a prompt
leda query --graph .leda "fix the auth middleware"
leda query --graph .leda --format llm --strategy symbol "database connection"
leda query --graph .leda --format json "auth" # structured JSON output
leda query --graph .leda --context narrow "where is TokenEstimate set?" # symbol summaries
The --context flag selects per-file detail. narrow emits a symbol
summary per file (cheap, best for narrow keyword-matchable questions);
wide (default) emits full file contents (best for tracing). medium
(skeletons) is reserved for a future release.
Graph statistics
leda stats --graph .leda # per-file fan-in/out
leda stats --graph .leda --group-by directory # package-level rollup
leda stats --graph .leda --format json
File-level output is the default; the trailing hint line in text mode
suggests --group-by directory when you want the answer aggregated by
parent directory (package). Intra-directory edges are excluded from the
directory-level rollup so the counts reflect cross-package dependencies
only. All commands support --format json for agent-friendly output.
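To make the directory rollup concrete, here is a minimal Go sketch of the idea. The `edge` and `fan` types, the `rollup` function, and the file names are all invented for illustration, and it assumes every cross-directory file edge is counted once; leda's own counting may differ.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// edge is a hypothetical file-level dependency: From imports To.
type edge struct{ From, To string }

// fan holds cross-directory edge counts for one directory.
type fan struct{ In, Out int }

// rollup aggregates file-level edges by parent directory and skips edges
// whose two endpoints live in the same directory.
func rollup(edges []edge) map[string]fan {
	out := make(map[string]fan)
	for _, e := range edges {
		src, dst := path.Dir(e.From), path.Dir(e.To)
		if src == dst {
			continue // intra-directory edge: not a cross-package dependency
		}
		s := out[src]
		s.Out++
		out[src] = s
		d := out[dst]
		d.In++
		out[dst] = d
	}
	return out
}

func main() {
	// Illustrative file names; only the directories come from the project layout.
	edges := []edge{
		{"cmd/leda/main.go", "internal/leda/graph.go"},
		{"internal/leda/graph.go", "internal/parser/parser.go"},
		{"internal/leda/graph.go", "internal/leda/seed.go"}, // same directory, skipped
	}
	stats := rollup(edges)
	dirs := make([]string, 0, len(stats))
	for d := range stats {
		dirs = append(dirs, d)
	}
	sort.Strings(dirs)
	for _, d := range dirs {
		fmt.Printf("%s fan-in=%d fan-out=%d\n", d, stats[d].In, stats[d].Out)
	}
}
```

In this sample, the edge between two files in `internal/leda` contributes nothing, so `internal/leda` ends up with a fan-in of 1 and a fan-out of 1.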
Agent usage
Leda is built for LLM agents working in a shell. The value prop in one line:
one leda query call replaces several grep/read cycles when exploring
unfamiliar code. Instead of iteratively narrowing via grep and reading each
hit, an agent gets back a curated, dependency-ordered file list in a single
shot.
Recommended invocation pattern:
leda build --root . # once per session
leda query --graph .leda "how does the auth middleware validate tokens?"
Use leda query first when the task is "trace how X works", "find where Y
is wired up", or "what files do I need to change for Z?": anywhere
multi-round grep/read exploration would otherwise be the default. Fall back
to native Read/Glob/Grep only when the exact file is already known or when
leda returns nothing useful for a prompt.
For narrow keyword-matchable questions ("where is X defined?", "which file
exports Y?"), add --context narrow to get a per-file symbol summary
instead of full file contents. Narrow mode often answers the question
without any follow-up Read. Pair it with --strategy symbol when the prompt
is an exact identifier name, so the seeder matches against symbols rather
than filenames.
--format json yields structured output suitable for piping into other
shell tooling or parsing inside an agent loop.
Supported languages
All parsers use tree-sitter for accurate AST-based import and symbol extraction.
| Language | CLI alias | Import resolution |
|---|---|---|
| Go | go | Go module + relative |
| TypeScript | ts | Relative |
| JavaScript | js | Relative |
| Python | py | Relative |
| Rust | rs | Relative |
| Java | java | Relative |
| C | c | Relative |
| C++ | cpp | Relative |
| Ruby | rb | Relative |
| PHP | php | Relative |
New languages can be added by defining a langConfig in internal/parser/languages.go with tree-sitter query patterns.
Project structure
cmd/leda/ CLI (build, query, stats, extract, version)
internal/
leda/ Graph building, context isolation, seeding
parser/ Tree-sitter parsers (imports + symbols)
resolve/ Import path → file resolution
testdata/ Integration test fixtures
How it works
- Build: Walk the project tree, parse each file for symbols and imports, construct a directed graph where edges represent dependencies.
- Seed: Tokenize the prompt, split identifiers on camelCase/snake_case boundaries, and match against filenames, symbols, or paths to find entry-point nodes.
- Isolate: From seed nodes, traverse the graph (descendants for single seeds, shortest paths + descendants for multiple seeds) to collect the relevant subgraph.
- Budget: Optionally cap results by file count or estimated token count.
Acknowledgements
Leda is inspired by graph-oriented-generation.
License
MIT
Author
Jonathan Gabor (@jgabor)