leda

v0.6.1
Published: Apr 17, 2026 License: MIT

Leda is a linked-edge dependency analyzer that provides context isolation for LLM tools. Given a codebase and a natural-language prompt, it builds a directed dependency graph from source files, seeds entry points from the prompt, and traverses the graph to return only the files the LLM actually needs.

Benchmarked across Go CLI tools, TypeScript monorepos, and Go libraries (6 PASS runs, 3 repos), leda reduces agent session cost by 18-41% (median 27%) and output tokens by 8-20% (median 16%) compared to unassisted grep/read exploration. Graph-topology questions (e.g. "most-depended-on files") see the largest gains (100% exploration reduction in every hot-paths run measured); orientation-style questions on small repos sometimes show modest benefit because the agent can answer from project structure alone.

The leda prime GUIDANCE blob, the text that SessionStart hooks and OpenCode plugins inject to teach an agent to route through leda, was compressed by 37% (3787 → 2398 bytes) in 0.6.1. A B36-comparable docs-corpus run on the agentera target held adoption at 15/15 with-sessions invoking leda (vs. the B36 baseline of 16/16) while reducing loop tokens by a further 10.48% on the 15 matched with-sessions. The persuasion stack (Authority opener, Commitment announcement clause, three exact invocations, FALLBACK and MUST-NOTs sections) is preserved; the only material removed was redundant SEQUENCING/FAILURE-MODES restatements and duplicated benchmark statistics.

Install

go install github.com/jgabor/leda/cmd/leda@latest

CLI

leda <command> [options]

Commands:
  build    Build and serialize a dependency graph
  query    Query the graph with a natural language prompt
  stats    Print graph statistics
  extract  Extract structured metadata from a source file
  version  Print the leda version

Build a graph
leda build --root ./myproject --output .leda --lang go,ts
leda build --root ./myproject --dry-run              # preview files without writing
leda build --root ./myproject --format json           # machine-readable output
leda build --root ./myproject --exclude '.openchamber,.worktrees'  # skip worktree-style sub-trees that .gitignore doesn't cover

--exclude accepts a comma-separated list of glob patterns matched against both the relative path and the basename of each entry. Use it for any directory that contains duplicate copies of the project (vendored worktrees, agent-experiment outputs, dataset mirrors) when those dirs aren't already covered by .gitignore or leda's built-in DefaultExclude (.git, node_modules, vendor, .next, __pycache__, .tox, dist, build).
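
The matching rule above can be sketched in Go. This is a minimal illustration, not leda's actual implementation: the function name shouldExclude is hypothetical, and it assumes the patterns follow the glob syntax of Go's path.Match, which the README does not state.

```go
package main

import (
	"path"
	"strings"
)

// shouldExclude (hypothetical name) reports whether relPath, a slash-separated
// path relative to the project root, matches any pattern in a comma-separated
// exclude list. Each pattern is tried against both the full relative path and
// the basename, as the --exclude description says.
func shouldExclude(relPath, excludeList string) bool {
	base := path.Base(relPath)
	for _, pattern := range strings.Split(excludeList, ",") {
		pattern = strings.TrimSpace(pattern)
		if pattern == "" {
			continue
		}
		// path.Match only errors on malformed patterns; treat those as no match.
		if ok, err := path.Match(pattern, relPath); err == nil && ok {
			return true
		}
		if ok, err := path.Match(pattern, base); err == nil && ok {
			return true
		}
	}
	return false
}
```

Under these assumptions, shouldExclude("experiments/.openchamber", ".openchamber,.worktrees") is true because the basename matches, while shouldExclude("cmd/leda/main.go", ".openchamber,.worktrees") is false. A directory walker built on this check would typically skip the whole sub-tree once a directory entry matches.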

Query with a prompt
leda query "fix the auth middleware"
leda query --format llm --strategy symbol "database connection"
leda query --format json "auth"                        # structured JSON output
leda query --context narrow "where is TokenEstimate set?"   # symbol summaries

leda query auto-builds the graph if .leda doesn't exist; pass --root <dir> only when invoking leda from outside the target project.

The --context flag selects per-file detail. narrow emits a symbol summary per file (cheap, best for narrow keyword-matchable questions); wide (default) emits full file contents (best for tracing). medium (skeletons) is reserved for a future release.

Graph statistics
leda stats --graph .leda                              # per-file fan-in/out
leda stats --graph .leda --group-by directory         # package-level rollup
leda stats --graph .leda --format json

File-level output is the default; the trailing hint line in text mode suggests --group-by directory when you want the answer aggregated by parent directory (package). Intra-directory edges are excluded from the directory-level rollup so the counts reflect cross-package dependencies only. All commands support --format json for agent-friendly output.
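
To make the directory-level rollup concrete, here is a small Go sketch of one way to aggregate file edges by parent directory while dropping intra-directory edges. The Edge and Fan types and the function name are assumptions made for illustration, and the sketch counts edges rather than distinct neighbors; leda's own counting rules may differ.

```go
package main

import "path"

// Edge is a hypothetical file-level dependency: From depends on To.
type Edge struct{ From, To string }

// Fan holds the incoming and outgoing edge counts for one directory.
type Fan struct{ In, Out int }

// rollupByDirectory aggregates file-level edges into per-directory counts.
// Edges whose endpoints share a parent directory are skipped, so the result
// reflects cross-package dependencies only.
func rollupByDirectory(edges []Edge) map[string]Fan {
	out := make(map[string]Fan)
	for _, e := range edges {
		from, to := path.Dir(e.From), path.Dir(e.To)
		if from == to {
			continue // intra-directory edge: excluded from the rollup
		}
		f := out[from]
		f.Out++
		out[from] = f

		t := out[to]
		t.In++
		out[to] = t
	}
	return out
}
```

For example, with edges cmd/leda/main.go → internal/leda/graph.go, internal/leda/graph.go → internal/leda/seed.go, and internal/leda/graph.go → internal/parser/go.go (file names invented for the example), the second edge is ignored and internal/leda ends up with one incoming and one outgoing edge.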

Agent usage

Leda is built for LLM agents working in a shell. Benchmarks show agents using leda reduce session cost by 27% (median) across 6 PASS runs on 3 repos.

ALWAYS run leda query first for structural codebase questions. One call replaces 5-10 grep/read rounds:

leda query "how does the auth middleware validate tokens?"

leda query auto-builds the graph if .leda doesn't exist. For large codebases or repeated queries, run leda build once to cache it.

Routing rules:

  1. Structural questions ("trace how X works", "what files handle Y"): leda query '<prompt>'
  2. Dependency/centrality ("most-depended-on files", "hot paths"): leda stats, then --group-by directory for package view
  3. Exact identifier lookups ("where is X defined?"): leda query --strategy symbol --context narrow '<name>'

Fall back to native Read/Glob/Grep only when the exact file is already known or when leda returns nothing useful.

--format json yields structured output suitable for piping into other shell tooling or parsing inside an agent loop.

Wiring leda into your agent

Agents only use leda if their system prompt tells them it exists. Add the following to your project's CLAUDE.md or AGENTS.md:

## leda

ALWAYS run `leda query` first for structural codebase questions. Benchmarks
show a 27% median reduction in session cost across 6 PASS runs on 3 repos.

```bash
leda query "how does X work?"
```

- `leda query` auto-builds the graph if `.leda` doesn't exist.
- For "most-depended-on files" questions: `leda stats`
- For "where is X defined?": `leda query --strategy symbol --context narrow 'X'`
- Fall back to Read/Grep ONLY when the exact file is already known.

For non-Claude agents, inject the same guidance via whatever system prompt mechanism your framework provides (e.g. --append-system-prompt for CLI wrappers, or the system field in API calls).

Adoption: hooks and plugins

Instead of hand-editing CLAUDE.md / AGENTS.md for each project, you can install a single integration that primes your agent with leda's routing rules automatically on every session — including re-injection after context compaction, when the original system prompt is otherwise lost.

Both integrations shell out to leda prime (the subcommand that prints the GUIDANCE blob) and carry a defensive activation guard: if leda is not on PATH or the working directory is not a leda-shaped project (no .leda, go.mod, package.json, pyproject.toml, Cargo.toml, requirements.txt, tsconfig.json, deno.json, pom.xml, build.gradle, or Gemfile), the integration no-ops silently — no injection, no error, no noise.
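
The activation guard can be pictured with the following Go sketch. It is an illustration only: the shipped integrations are a shell hook and a TypeScript plugin, and the names guardPasses, fileExists, and primeGuidance are hypothetical. The marker list is copied from the paragraph above; leda prime is the only leda subcommand invoked.

```go
package main

import (
	"os"
	"os/exec"
	"path/filepath"
)

// projectMarkers are the files or directories listed above as making a
// working directory "leda-shaped".
var projectMarkers = []string{
	".leda", "go.mod", "package.json", "pyproject.toml", "Cargo.toml",
	"requirements.txt", "tsconfig.json", "deno.json", "pom.xml",
	"build.gradle", "Gemfile",
}

// guardPasses (hypothetical name) reports whether guidance should be injected:
// leda must be on PATH and dir must contain at least one project marker.
// exists is injected so the check can be exercised without a real file system.
func guardPasses(ledaOnPath bool, dir string, exists func(path string) bool) bool {
	if !ledaOnPath {
		return false
	}
	for _, marker := range projectMarkers {
		if exists(filepath.Join(dir, marker)) {
			return true
		}
	}
	return false
}

// fileExists reports whether os.Stat succeeds for path.
func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// primeGuidance returns the output of `leda prime` run in dir, or "" when the
// guard trips or the command fails, mirroring the silent no-op behavior.
func primeGuidance(dir string) string {
	_, lookErr := exec.LookPath("leda")
	if !guardPasses(lookErr == nil, dir, fileExists) {
		return ""
	}
	cmd := exec.Command("leda", "prime")
	cmd.Dir = dir
	out, err := cmd.Output()
	if err != nil {
		return ""
	}
	return string(out)
}
```

Returning an empty string on every failure path keeps the caller simple: it prints whatever it gets, and printing nothing is the no-op.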

Manifests live under integrations/<runtime>/ so new runtimes follow an obvious layout convention.

Claude Code

The integrations/claude-code/leda-prime.json fragment registers a SessionStart hook whose matcher is "startup|resume|clear|compact". Any stdout the hook command produces is injected as context at session start, and because compact is one of the matched sources, the hook fires again after compaction — re-injecting the GUIDANCE that compaction would otherwise discard. (PreCompact stdout is not auto-injected per Claude Code's docs, so SessionStart with a compact-matching regex is the documented re-injection path.)

What does this do? On every fresh Claude Code session, on /clear, on --resume, and on post-compact resume, the hook runs command -v leda, checks for a leda-shaped project marker, runs leda prime, and prints the GUIDANCE to stdout. Claude Code inserts that stdout into the model's initial context. Your agent sees leda's routing rules without you pasting them into any file.

Install (idempotent — re-running is a no-op):

mkdir -p ~/.claude
python3 -c '
import json, pathlib
dst = pathlib.Path.home() / ".claude" / "settings.json"
src = json.load(open("integrations/claude-code/leda-prime.json"))
cur = json.load(open(dst)) if dst.exists() else {}
existing = cur.setdefault("hooks", {}).setdefault("SessionStart", [])
for entry in src["hooks"]["SessionStart"]:
    # Only consider fragment entries that actually invoke leda prime.
    if not any("leda prime" in x.get("command", "") for x in entry["hooks"]):
        continue
    # Stop without writing if an equivalent hook is already registered.
    if any(any("leda prime" in x.get("command", "") for x in e.get("hooks", [])) for e in existing):
        print("already installed; skipping", dst); break
    existing.append(entry)
else:
    # Reached only when the loop did not break: persist the merged settings.
    dst.write_text(json.dumps(cur, indent=2) + "\n")
    print("installed into", dst)
'

(Or, for project-scoped install, write to .claude/settings.json in your repo root instead of ~/.claude/settings.json.)

Uninstall:

python3 -c '
import json, pathlib, sys
dst = pathlib.Path.home() / ".claude" / "settings.json"
if not dst.exists():
    print("nothing to uninstall;", dst, "not found"); sys.exit(0)
cur = json.load(open(dst))
hooks = cur.get("hooks", {})
if "SessionStart" in hooks:
    # Keep every SessionStart entry except those whose command runs leda prime.
    hooks["SessionStart"] = [
        h for h in hooks["SessionStart"]
        if not any("leda prime" in x.get("command", "") for x in h.get("hooks", []))
    ]
dst.write_text(json.dumps(cur, indent=2) + "\n")
print("uninstalled from", dst)
'

Verify install: start a fresh Claude Code session in a leda-shaped project and ask the agent "what tools do you have available for structural codebase questions?" — the reply should mention leda query and the routing rules from leda prime. You can also run the hook command directly to see what Claude Code will see:

cd path/to/a/leda-shaped/project
sh -c "$(python3 -c 'import json; print(json.load(open("integrations/claude-code/leda-prime.json"))["hooks"]["SessionStart"][0]["hooks"][0]["command"])')"

Empty output means the defensive guard tripped (no leda binary, or no project marker). Non-empty output starting with ALWAYS run leda FIRST means the hook is wired correctly.

OpenCode

The integrations/opencode/leda-prime-plugin.ts plugin hooks into two OpenCode extension points: experimental.chat.system.transform (fires on every chat session start and appends GUIDANCE to output.system) and experimental.session.compacting (fires after context compaction and appends GUIDANCE to output.context, so routing rules survive the compaction boundary).

What does this do? On every new OpenCode chat session, and again on every compaction, the plugin runs its defensive guard, shells out to leda prime, and appends the GUIDANCE blob to OpenCode's system surface (output.system at session start, output.context at compaction). Your agent sees leda's routing rules on turn one and keeps seeing them across compactions.

Install (project-local):

mkdir -p .opencode/plugin
cp integrations/opencode/leda-prime-plugin.ts .opencode/plugin/leda-prime-plugin.ts

Or user-global:

mkdir -p ~/.config/opencode/plugin
cp integrations/opencode/leda-prime-plugin.ts ~/.config/opencode/plugin/leda-prime-plugin.ts

Uninstall:

rm -f .opencode/plugin/leda-prime-plugin.ts ~/.config/opencode/plugin/leda-prime-plugin.ts

Verify install: run the plugin's own smoke test (requires bun and a local leda build on PATH):

go build -o ./build/leda ./cmd/leda
PATH="$PWD/build:$PATH" bun run integrations/opencode/leda-prime-plugin.smoke.ts

All 9 assertions should print [PASS] across the three scenarios (leda-shaped + leda on PATH → GUIDANCE injected; no project marker → silent no-op; leda absent from PATH → silent no-op).

Supported languages

All parsers use tree-sitter for accurate AST-based import and symbol extraction.

Language     CLI alias   Import resolution
Go           go          Go module + relative
TypeScript   ts          Relative
JavaScript   js          Relative
Python       py          Relative
Rust         rs          Relative
Java         java        Relative
C            c           Relative
C++          cpp         Relative
Ruby         rb          Relative
PHP          php         Relative

New languages can be added by defining a langConfig in internal/parser/languages.go with tree-sitter query patterns.

Project structure

cmd/leda/           CLI (build, query, stats, extract, version)
internal/
  leda/             Graph building, context isolation, seeding
  parser/           Tree-sitter parsers (imports + symbols)
  resolve/          Import path → file resolution
testdata/           Integration test fixtures

How it works

  1. Build: Walk the project tree, parse each file for symbols and imports, construct a directed graph where edges represent dependencies.
  2. Seed: Tokenize the prompt, split identifiers on camelCase/snake_case boundaries, and match against filenames, symbols, or paths to find entry-point nodes.
  3. Isolate: From seed nodes, traverse the graph (descendants for single seeds, shortest paths + descendants for multiple seeds) to collect the relevant subgraph.
  4. Budget: Optionally cap results by file count or estimated token count.
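
The Seed, Isolate, and Budget steps above can be sketched in Go as follows. This is a simplified illustration under stated assumptions, not leda's code: the function names are hypothetical, the graph is a plain adjacency map, only the single-seed case (descendants) is covered, and the budget is a file-count cap. The splitter handles lower-to-upper camelCase boundaries and underscores but does not break acronym runs such as "HTTPServer".

```go
package main

import (
	"strings"
	"unicode"
)

// splitIdentifier breaks an identifier on snake_case and lower-to-upper
// camelCase boundaries and lowercases each part.
func splitIdentifier(word string) []string {
	var parts []string
	var cur []rune
	flush := func() {
		if len(cur) > 0 {
			parts = append(parts, strings.ToLower(string(cur)))
			cur = cur[:0]
		}
	}
	runes := []rune(word)
	for i, r := range runes {
		if r == '_' {
			flush()
			continue
		}
		if unicode.IsUpper(r) && i > 0 && unicode.IsLower(runes[i-1]) {
			flush() // boundary such as token|Estimate
		}
		cur = append(cur, r)
	}
	flush()
	return parts
}

// tokenize splits a prompt into words, then splits each word as an identifier.
func tokenize(prompt string) []string {
	words := strings.FieldsFunc(prompt, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r) && r != '_'
	})
	var tokens []string
	for _, w := range words {
		tokens = append(tokens, splitIdentifier(w)...)
	}
	return tokens
}

// descendants returns seed plus every node reachable from it in breadth-first
// order. A positive maxFiles caps the result (the Budget step); 0 means no cap.
func descendants(graph map[string][]string, seed string, maxFiles int) []string {
	seen := map[string]bool{seed: true}
	order := []string{seed}
	for i := 0; i < len(order); i++ {
		for _, next := range graph[order[i]] {
			if !seen[next] {
				seen[next] = true
				order = append(order, next)
			}
		}
	}
	if maxFiles > 0 && len(order) > maxFiles {
		order = order[:maxFiles]
	}
	return order
}
```

With this sketch, tokenize("where is TokenEstimate set?") yields where, is, token, estimate, set. Matching those tokens against filenames, symbols, or paths would select the seed nodes, and descendants then collects the files a seed depends on, directly or transitively.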

Acknowledgements

Leda is inspired by graph-oriented-generation.

License

MIT

Author

Jonathan Gabor (@jgabor)

Directories

Path               Synopsis
cmd/leda           Command leda builds dependency graphs and isolates context for LLM tools.
internal/leda      Package leda provides dependency-graph context isolation for LLM tools.
internal/parser    Package parser provides interfaces and implementations for extracting import dependencies and exported symbols from source files.
internal/resolve   Package resolve turns raw import strings into absolute file paths.
