Documentation ¶
Overview ¶
Package index builds and maintains the symbol graph that powers mdsmith's LSP navigation methods (documentSymbol, definition, references, workspace/symbol, callHierarchy).
The graph stores four kinds of symbols — headings, link-reference definitions, top-level front-matter keys, and directives — together with the inbound/outbound reference edges that connect them across files: anchor links, file links, reference-style links, and the include / catalog / build directive targets.
Build is workspace-wide; updates are per-file. Callers re-parse one buffer with Update on document events and rebuild the whole index when the project's `.mdsmith.yml` changes (kind / ignore globs may shift scope).
Index ¶
- func NormalizePath(path string) string
- func RefDefRegexpMatches(body []byte) [][]int
- func SafeURLPathEscape(s string) string
- type CompletionContext
- type CompletionTag
- type Edge
- type EdgeKind
- type FileEntry
- type Index
- func (i *Index) AbsPathToWorkspace(abs string) string
- func (i *Index) BacklinksFor(file string) []Edge
- func (i *Index) Build(files []string, load func(path string) ([]byte, error))
- func (i *Index) BuildSerial(files []string, load func(path string) ([]byte, error))
- func (i *Index) File(path string) (*FileEntry, bool)
- func (i *Index) Files() []string
- func (i *Index) FilesByKind(kind string) []string
- func (i *Index) IncomingEdges(file, anchor string) []Edge
- func (i *Index) OutgoingEdges(file string) []Edge
- func (i *Index) Remove(path string)
- func (i *Index) Root() string
- func (i *Index) SearchSymbols(query string, max int) []SymbolMatch
- func (i *Index) Update(path string, source []byte)
- func (i *Index) UpdateWithKinds(path string, source []byte, kinds []string)
- type LocateResult
- type Locator
- type Symbol
- type SymbolKind
- type SymbolMatch
- type TokenTag
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func NormalizePath ¶
NormalizePath returns path with forward slashes and no leading `./`. Empty input passes through. Backslashes are translated even on platforms where filepath.ToSlash is a no-op so a Windows-style path landing in the index from a cross-platform test still keys against the same slot as the slashed form.
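The normalization rules described above can be sketched as follows. This is a minimal stand-in, not the package's implementation: `normalizePath` is a hypothetical helper, and stripping a single leading `./` is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePath is a hypothetical stand-in for NormalizePath.
// It converts backslashes to forward slashes on every platform
// and drops one leading "./". Empty input passes through.
func normalizePath(path string) string {
	if path == "" {
		return ""
	}
	p := strings.ReplaceAll(path, `\`, "/")
	return strings.TrimPrefix(p, "./")
}

func main() {
	// All three spellings key against the same slot.
	fmt.Println(normalizePath(`.\docs\guide.md`)) // docs/guide.md
	fmt.Println(normalizePath("./docs/guide.md")) // docs/guide.md
	fmt.Println(normalizePath("docs/guide.md"))   // docs/guide.md
}
```

Replacing backslashes explicitly, rather than relying on filepath.ToSlash, is what keeps the result identical on Unix-like hosts.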
func RefDefRegexpMatches ¶
RefDefRegexpMatches returns the same submatch indices refDefRE.FindAllSubmatchIndex produces for body. Exported so the LSP rename surface can iterate every reference definition without duplicating the regex pattern (and without giving callers a way to mutate the package-level pattern).
func SafeURLPathEscape ¶
SafeURLPathEscape applies net/url.PathEscape to s. PathEscape percent-encodes every byte that would be unsafe in a URL path segment (spaces, slashes, `%`, and a long list of reserved / non-ASCII characters), so the result is safe to drop into a `file://` URI fragment without further encoding. Despite the name, the function is not limited to "%" sequences; callers that need query-string encoding instead should reach for QueryEscape directly.
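To illustrate what the underlying standard-library call does, here is a small sketch. `escapeSegment` is a hypothetical wrapper name; only `url.PathEscape` and `url.QueryEscape` are real APIs.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapeSegment is a hypothetical wrapper that mirrors the
// documented behavior: it delegates to url.PathEscape.
func escapeSegment(s string) string {
	return url.PathEscape(s)
}

func main() {
	// Space, slash, and percent are all percent-encoded.
	fmt.Println(escapeSegment("my notes/50%.md")) // my%20notes%2F50%25.md

	// QueryEscape differs: it encodes a space as "+".
	fmt.Println(url.QueryEscape("my notes"))
}
```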
Types ¶
type CompletionContext ¶
type CompletionContext struct {
Tag CompletionTag
// Prefix is the partial text typed so far.
Prefix string
// TargetFile is the workspace-relative target file for CompletionAnchorOtherFile.
TargetFile string
// DirectiveName is the directive name for CompletionDirectivePath.
DirectiveName string
// DirectiveArg is the directive argument key for CompletionDirectivePath.
DirectiveArg string
// FrontMatterKey is "kind" or "kinds" for CompletionKindValue.
FrontMatterKey string
}
CompletionContext is the result of Locator.CompletionContext.
type CompletionTag ¶
type CompletionTag int
CompletionTag classifies the context in which a completion was triggered.
const (
// CompletionNone means the cursor is not in a completable context.
CompletionNone CompletionTag = iota
// CompletionAnchorCurrentFile means the cursor is inside [text](#prefix.
CompletionAnchorCurrentFile
// CompletionAnchorOtherFile means the cursor is inside [text](path.md#prefix.
CompletionAnchorOtherFile
// CompletionRefLabel means the cursor is inside [text][prefix.
CompletionRefLabel
// CompletionKindValue means the cursor is on a front-matter kind:/kinds: value.
CompletionKindValue
// CompletionDirectivePath means the cursor is on a directive file/source/glob arg.
CompletionDirectivePath
)
type Edge ¶
type Edge struct {
SourceFile string
SourceLine int // 1-based
SourceCol int // 1-based
TargetFile string
TargetAnchor string
TargetLabel string
Kind EdgeKind
Unresolved bool
}
Edge records one reference from a source position to a target.
Empty TargetFile means "same file as Source" (used for anchor and reference-style links). Empty TargetAnchor means the reference targets the file as a whole (e.g. `[text](./other.md)`).
Unresolved is set on edges whose target shape is a glob pattern (catalog directives) rather than a single file. Reverse-edge queries (IncomingEdges / BacklinksFor) skip unresolved edges so catalog directives don't surface as phantom self-backlinks the way empty-TargetFile placeholders did before plan 153.
type EdgeKind ¶
type EdgeKind int
EdgeKind enumerates the kinds of references the index tracks.
const (
// EdgeAnchorLink is `[text](#anchor)` — same-file heading reference.
EdgeAnchorLink EdgeKind = iota
// EdgeFileLink is `[text](./other.md)` (with optional anchor).
EdgeFileLink
// EdgeRefLink is `[text][label]` — reference-style link use.
EdgeRefLink
// EdgeInclude is a `<?include file: …?>` directive.
EdgeInclude
// EdgeCatalog is a `<?catalog?>` directive.
EdgeCatalog
// EdgeBuild is a `<?build source: …?>` directive.
EdgeBuild
)
type FileEntry ¶
type FileEntry struct {
// Path is the workspace-relative path with forward slashes.
Path string
// Symbols are this file's symbols, in document order.
Symbols []Symbol
// Outgoing are the references this file emits.
Outgoing []Edge
// Title is the front-matter `title:` value if set, "" otherwise.
Title string
// Kinds are the front-matter `kinds:` values if set.
Kinds []string
// LineCount is the number of source lines (1-based-inclusive
// upper bound for symbol ranges). Used to bound heading ranges.
LineCount int
}
FileEntry is one file's contribution to the index.
type Index ¶
type Index struct {
// contains filtered or unexported fields
}
Index is the workspace-wide symbol graph. Methods are safe to call concurrently with each other; concurrent Update/Remove on the same path is serialized internally.
func (*Index) AbsPathToWorkspace ¶
AbsPathToWorkspace returns the workspace-relative form of abs given the index's root directory. When abs is already relative, or when root is empty, the input is returned unchanged.
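A rough sketch of this conversion, assuming it is built on `filepath.Rel`; the helper name `absToWorkspace` and the fallback on error are assumptions, and how the real method treats paths outside the root is not stated here.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// absToWorkspace is a hypothetical stand-in for AbsPathToWorkspace.
// Relative input and an empty root are returned unchanged, as the
// documentation describes. Falling back to the input when Rel
// fails is an assumption of this sketch.
func absToWorkspace(root, abs string) string {
	if root == "" || !filepath.IsAbs(abs) {
		return abs
	}
	rel, err := filepath.Rel(root, abs)
	if err != nil {
		return abs
	}
	return filepath.ToSlash(rel)
}

func main() {
	// On a Unix-like system "/ws/docs/a.md" is absolute.
	fmt.Println(absToWorkspace("/ws", "/ws/docs/a.md"))
	fmt.Println(absToWorkspace("/ws", "docs/a.md")) // already relative
	fmt.Println(absToWorkspace("", "/ws/docs/a.md")) // no root set
}
```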
func (*Index) BacklinksFor ¶
BacklinksFor returns every workspace edge whose target is file, regardless of anchor. Use this for the "what cites this file?" question — IncomingEdges(file, anchor) answers the narrower "what targets this specific heading".
Like IncomingEdges, BacklinksFor drops Unresolved edges (catalog directives whose glob pattern hasn't been expanded) so they don't surface here as phantom self-backlinks on every catalog host file.
Same-file citations (EdgeAnchorLink, EdgeRefLink) stay in the result so callers can filter on SourceFile when they want only external citations. The returned slice is freshly allocated and sorted by (SourceFile, SourceLine, SourceCol) so callers presenting the result to a user — or asserting on it in a test — see a stable order regardless of the underlying map iteration.
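The stable ordering described above can be sketched with `sort.Slice`. The trimmed-down `Edge` type and the `sortedCopy` helper are illustrative only; the real Edge has more fields.

```go
package main

import (
	"fmt"
	"sort"
)

// Edge is a reduced, hypothetical version of the package's Edge,
// keeping only the fields that take part in the ordering.
type Edge struct {
	SourceFile string
	SourceLine int
	SourceCol  int
}

// sortedCopy returns a freshly allocated slice ordered by
// (SourceFile, SourceLine, SourceCol), leaving the input untouched.
func sortedCopy(edges []Edge) []Edge {
	out := append([]Edge(nil), edges...)
	sort.Slice(out, func(a, b int) bool {
		x, y := out[a], out[b]
		if x.SourceFile != y.SourceFile {
			return x.SourceFile < y.SourceFile
		}
		if x.SourceLine != y.SourceLine {
			return x.SourceLine < y.SourceLine
		}
		return x.SourceCol < y.SourceCol
	})
	return out
}

func main() {
	in := []Edge{
		{SourceFile: "b.md", SourceLine: 1, SourceCol: 1},
		{SourceFile: "a.md", SourceLine: 9, SourceCol: 2},
		{SourceFile: "a.md", SourceLine: 3, SourceCol: 7},
	}
	for _, e := range sortedCopy(in) {
		fmt.Println(e.SourceFile, e.SourceLine, e.SourceCol)
	}
}
```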
func (*Index) Build ¶
Build walks the workspace and indexes every Markdown file the supplied loader yields. The loader is called once per discovered path; returning an error skips that file. files is the list of workspace-relative paths to index, typically produced by discovery.Discover and then made workspace-relative.
Build replaces the entire current index, including evicting any entries whose path no longer appears in files.
Build fans out the per-file extractor across runtime.GOMAXPROCS(0) worker goroutines, so a multi-thousand-file workspace lands in the graph in roughly the serial wall-clock time divided by the number of CPU cores. The extractor itself is pure given (path, bytes); the only shared state is the result map, which a single collector goroutine drains. The supplied loader is called concurrently; callers whose loader is not safe for concurrent calls must serialise inside it or fall back to BuildSerial.
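The fan-out shape described above might look roughly like the following worker-pool sketch. Everything here (`entry`, `extract`, `build`, `mapLoader`) is hypothetical and greatly simplified; it only demonstrates the pattern of GOMAXPROCS workers feeding a single collector.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// entry is a hypothetical, reduced per-file result.
type entry struct {
	Path string
	Size int
}

// extract is a pure function of (path, bytes), so workers share nothing.
func extract(path string, src []byte) entry {
	return entry{Path: path, Size: len(src)}
}

// mapLoader returns a loader backed by a read-only map, which is
// safe to call from several goroutines at once.
func mapLoader(m map[string]string) func(string) ([]byte, error) {
	return func(p string) ([]byte, error) {
		s, ok := m[p]
		if !ok {
			return nil, fmt.Errorf("no such file: %s", p)
		}
		return []byte(s), nil
	}
}

// build fans files out to GOMAXPROCS workers; the calling goroutine
// acts as the single collector that owns the result map.
func build(files []string, load func(string) ([]byte, error)) map[string]entry {
	paths := make(chan string)
	results := make(chan entry)
	var wg sync.WaitGroup
	for w := 0; w < runtime.GOMAXPROCS(0); w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range paths {
				src, err := load(p)
				if err != nil {
					continue // a loader error skips the file
				}
				results <- extract(p, src)
			}
		}()
	}
	go func() {
		for _, f := range files {
			paths <- f
		}
		close(paths)
		wg.Wait()
		close(results)
	}()
	out := make(map[string]entry, len(files))
	for e := range results {
		out[e.Path] = e
	}
	return out
}

func main() {
	load := mapLoader(map[string]string{"a.md": "# A", "b.md": "# B\n"})
	idx := build([]string{"a.md", "b.md", "missing.md"}, load)
	fmt.Println(len(idx)) // 2
}
```

Because only the collector writes to the map, no mutex is needed around it; the loader is the one piece that must tolerate concurrent calls.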
func (*Index) BuildSerial ¶
BuildSerial is the single-threaded variant of Build. Use this when the loader is not safe for concurrent calls.
func (*Index) File ¶
File returns a snapshot of the FileEntry for the given workspace-relative path. The pointer is to a copy so callers may read the slices without holding the index lock; the slices themselves are shared, so callers must not mutate them.
func (*Index) Files ¶
Files returns a snapshot of the indexed file paths in arbitrary order. Callers must not retain the slice across mutations of the index.
func (*Index) FilesByKind ¶
FilesByKind returns workspace files whose front-matter `kinds:` list contains kind. Order is undefined.
func (*Index) IncomingEdges ¶
IncomingEdges returns every workspace edge whose target is the given (file, anchor). When anchor is "", it matches edges to the file at large (no anchor specified by the caller).
Unresolved edges (catalog directives whose glob hasn't been expanded) are skipped — they don't yet point at a specific file, so they can't satisfy a (file, anchor) match. Treating their empty TargetFile as "same file" the way concrete same-file edges are treated would misattribute them as phantom self-backlinks (see plan 153 for the unification that introduced the flag).
The returned slice is a fresh copy.
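The matching rule can be sketched as a predicate over a flat edge list. This is an illustration under assumptions: the reduced `Edge` type, the `targets` and `incomingEdges` helpers, and the reading that an empty anchor matches only anchorless edges are not confirmed by the package.

```go
package main

import "fmt"

// Edge is a reduced, hypothetical version of the package's Edge.
type Edge struct {
	SourceFile   string
	TargetFile   string
	TargetAnchor string
	Unresolved   bool
}

// targets reports whether e points at (file, anchor).
// Unresolved edges never match. An empty TargetFile on a resolved
// edge means "same file as the source".
func targets(e Edge, file, anchor string) bool {
	if e.Unresolved {
		return false
	}
	target := e.TargetFile
	if target == "" {
		target = e.SourceFile
	}
	return target == file && e.TargetAnchor == anchor
}

// incomingEdges returns a fresh slice of the matching edges.
func incomingEdges(all []Edge, file, anchor string) []Edge {
	out := []Edge{}
	for _, e := range all {
		if targets(e, file, anchor) {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	all := []Edge{
		{SourceFile: "a.md", TargetAnchor: "intro"},                      // same-file anchor link
		{SourceFile: "b.md", TargetFile: "a.md", TargetAnchor: "intro"}, // cross-file link
		{SourceFile: "a.md", Unresolved: true},                           // catalog glob, skipped
	}
	fmt.Println(len(incomingEdges(all, "a.md", "intro"))) // 2
}
```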
func (*Index) OutgoingEdges ¶
OutgoingEdges returns the edges originating in file.
func (*Index) SearchSymbols ¶
func (i *Index) SearchSymbols(query string, max int) []SymbolMatch
SearchSymbols returns symbols whose name (case-insensitive) contains query. Match scope:
- heading text
- link-ref labels
- front-matter title (matched against the file's Title)
- kind names from kinds:
Returns at most max entries (0 = unlimited).
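A simplified sketch of the matching rule, assuming a plain case-insensitive substring test over symbol names. The reduced `Symbol` type and `searchSymbols` helper are hypothetical, and the parameter is called `limit` here only to avoid shadowing Go's built-in `max`.

```go
package main

import (
	"fmt"
	"strings"
)

// Symbol is a reduced, hypothetical version of the package's Symbol.
type Symbol struct {
	File string
	Name string
}

// searchSymbols returns symbols whose Name contains query,
// ignoring case. limit <= 0 means unlimited.
func searchSymbols(symbols []Symbol, query string, limit int) []Symbol {
	q := strings.ToLower(query)
	var out []Symbol
	for _, s := range symbols {
		if limit > 0 && len(out) >= limit {
			break
		}
		if strings.Contains(strings.ToLower(s.Name), q) {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	syms := []Symbol{
		{File: "a.md", Name: "Install Guide"},
		{File: "b.md", Name: "Uninstall"},
		{File: "c.md", Name: "Changelog"},
	}
	fmt.Println(len(searchSymbols(syms, "INSTALL", 0))) // 2
	fmt.Println(len(searchSymbols(syms, "install", 1))) // 1
}
```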
func (*Index) Update ¶
Update re-parses source under path and replaces the FileEntry. When source is empty the file is removed entirely (matches the case where the file was deleted from disk).
path must be workspace-relative. AbsPathToWorkspace is provided as a helper for callers that hold an absolute filesystem path.
func (*Index) UpdateWithKinds ¶
UpdateWithKinds is Update plus an override for the file's effective kinds list. Callers pass the resolved (front-matter ∪ kind-assignment) list so workspace-symbol search and `kind:` navigation see config-driven assignments, not just front-matter declarations. When kinds is nil the result is identical to Update.
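A caller might compute the resolved list along these lines before handing it to UpdateWithKinds. `effectiveKinds` is a hypothetical helper, and putting front-matter kinds first is an assumption of this sketch.

```go
package main

import "fmt"

// effectiveKinds returns the union of front-matter kinds and
// config-assigned kinds, without duplicates. Front-matter entries
// come first (an ordering chosen for this sketch).
func effectiveKinds(frontMatter, assigned []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, list := range [][]string{frontMatter, assigned} {
		for _, k := range list {
			if !seen[k] {
				seen[k] = true
				out = append(out, k)
			}
		}
	}
	return out
}

func main() {
	kinds := effectiveKinds([]string{"guide", "api"}, []string{"api", "plan"})
	fmt.Println(kinds) // [guide api plan]
}
```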
type LocateResult ¶
type LocateResult struct {
Tag TokenTag
// Heading: the slug.
Anchor string
// Heading: the level.
Level int
// Heading text (the displayed name).
Name string
// File link target (workspace-relative, from the cursor's host
// file). Empty for anchor-only or ref-style links.
TargetFile string
// File link / anchor link: the target's heading anchor (slug).
TargetAnchor string
// Reference link / def label, normalized.
Label string
// Directive name when on a TokenDirectiveArg.
DirectiveName string
// Directive argument key (e.g. "file" or "source").
DirectiveArg string
// Directive argument value (raw, untrimmed).
DirectiveValue string
// DirectiveTargetFile is the raw `file:` (for include) or
// `source:` (for build) value the cursor sits on, copied
// verbatim from the directive body. It is *not* resolved
// against the host file's directory — the LSP layer pipes it
// through index.ResolveRelTarget, which applies the same
// escape-the-root rejection rules as the edge collector.
DirectiveTargetFile string
// Front-matter key (when on TokenFrontMatterKey/Value).
FrontMatterKey string
// Front-matter value when on TokenFrontMatterValue.
FrontMatterValue string
}
LocateResult is the payload of a successful Locate call.
type Locator ¶
type Locator struct {
Path string // workspace-relative path
}
Locator walks one parsed file's source and resolves a 1-based (line, character) position to a TokenTag plus the relevant payload.
Ranges are computed in mdsmith coordinates (1-based lines and columns); LSP-coordinate translation is the LSP layer's job. The returned LocateResult is meant to be self-contained: every field the caller needs to issue a follow-up Lookup is on the result, so the LSP handler does not need to re-parse.
func (Locator) CompletionContext ¶
func (l Locator) CompletionContext(source []byte, line, col int) CompletionContext
CompletionContext determines the completion context at (line, col) in source. line and col are 1-based. Returns a CompletionContext describing the trigger context and the prefix typed so far.
The function operates on whatever bytes the caller hands in, so the LSP layer can call it on the live editor buffer without first landing the change in the index.
func (Locator) Locate ¶
func (l Locator) Locate(source []byte, line, col int) LocateResult
Locate returns the token tag at (line, col) in source. line and col are 1-based; col counts UTF-8 bytes (consistent with the rest of the index). When no specific tag fits, TokenNone is returned.
The function operates on whatever bytes the caller hands in, so the LSP layer can call it on the live editor buffer without first landing the change in the index.
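Since columns here count UTF-8 bytes and are 1-based, an LSP layer that uses the protocol's default UTF-16, 0-based positions has to translate. One possible conversion is sketched below; `byteColToUTF16` is a hypothetical helper, not part of this package.

```go
package main

import "fmt"

// byteColToUTF16 converts a 1-based UTF-8 byte column on line into
// a 0-based count of UTF-16 code units, which is how LSP positions
// are expressed by default. Columns outside the line are clamped.
func byteColToUTF16(line string, col int) int {
	end := col - 1
	if end < 0 {
		end = 0
	}
	if end > len(line) {
		end = len(line)
	}
	units := 0
	for _, r := range line[:end] {
		if r >= 0x10000 {
			units += 2 // encoded as a surrogate pair in UTF-16
		} else {
			units++
		}
	}
	return units
}

func main() {
	line := "héllo 😀 x"
	// 'x' starts at byte column 13: "é" takes 2 bytes, "😀" takes 4.
	fmt.Println(byteColToUTF16(line, 13)) // 9
	fmt.Println(byteColToUTF16(line, 1))  // 0
}
```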
type Symbol ¶
type Symbol struct {
// File is the workspace-relative path of the containing file
// (forward slashes, no leading `./`). Index lookups key on this.
File string
// Kind is the symbol category.
Kind SymbolKind
// Name is the human-readable label (heading text, key, label,
// directive name).
Name string
// Anchor is the normalized identifier used for cross-document
// lookups: heading slug, link-ref label, or "" for other kinds.
Anchor string
// Level is the heading level (1–6) for SymbolHeading; 0 otherwise.
Level int
// StartLine, EndLine are 1-based line numbers covering the
// symbol's full range. For headings the range extends to the
// next sibling heading; for other kinds it's the source line.
StartLine int
EndLine int
// SelectionLine, SelectionCol point to the symbol's name/label
// (1-based) — what an editor highlights when "go to definition"
// jumps to it.
SelectionLine int
SelectionCol int
}
Symbol is one entry in a file's outline.
type SymbolKind ¶
type SymbolKind int
SymbolKind enumerates the four symbol shapes the index recognizes. Each maps to a specific LSP SymbolKind in the LSP layer; this package keeps the spec-level numbers out of its core types.
const (
// SymbolHeading is a Markdown heading at any level (H1–H6). The
// Anchor field carries the slug; the Level field carries the
// heading level.
SymbolHeading SymbolKind = iota
// SymbolLinkRef is a `[label]: url` link-reference definition.
// The Anchor field carries the normalized label.
SymbolLinkRef
// SymbolFrontMatter is a top-level YAML front-matter key. The
// Name field carries the key.
SymbolFrontMatter
// SymbolDirective is a processing-instruction block (<?name … ?>).
// The Name field carries the directive name.
SymbolDirective
)
type SymbolMatch ¶
SymbolMatch pairs a Symbol with the file that contains it. Returned from workspace-wide queries so callers can build LSP locations.
type TokenTag ¶
type TokenTag int
TokenTag classifies the shape of source token under the cursor. LSP definition / implementation / references handlers branch on this to decide what to resolve.
const (
// TokenNone is the catch-all "cursor is on plain prose" tag.
TokenNone TokenTag = iota
// TokenHeading is the cursor on a heading line.
TokenHeading
// TokenAnchorLink is the cursor inside an `[…](#anchor)` link.
TokenAnchorLink
// TokenFileLink is the cursor inside an `[…](./other.md…)` link.
TokenFileLink
// TokenRefUse is the cursor inside an `[…][label]` reference-style link.
TokenRefUse
// TokenRefDef is the cursor inside a `[label]: url` definition.
TokenRefDef
// TokenDirectiveArg is the cursor on a directive argument value
// (e.g. the `file:` value inside `<?include?>`).
TokenDirectiveArg
// TokenFrontMatterKey is the cursor on a top-level FM key.
TokenFrontMatterKey
// TokenFrontMatterValue is the cursor on a value beside a top-level
// FM key — currently used only for `kind:` value lookups.
TokenFrontMatterValue
// TokenFileTop is the cursor at the first line of the body —
// the line immediately after any stripped YAML front matter,
// not necessarily line 1 of the source. The locator strips
// `---\n…\n---\n` before tagging, so on a file with front
// matter (line 1, col 1) lands inside the front-matter range
// and surfaces as TokenFrontMatterKey instead.
TokenFileTop
)