index

package
v0.23.0
Published: May 21, 2026 License: MIT Imports: 18 Imported by: 0

Documentation

Overview

Package index builds and maintains the symbol graph that powers mdsmith's LSP navigation methods (documentSymbol, definition, references, workspace/symbol, callHierarchy).

The graph stores four kinds of symbols — headings, link-reference definitions, top-level front-matter keys, and directives — together with the inbound/outbound reference edges that connect them across files: anchor links, file links, reference-style links, and the include / catalog / build directive targets.

Build is workspace-wide; updates are per-file. Callers re-parse one buffer with Update on document events and rebuild the whole index when the project's `.mdsmith.yml` changes (kind / ignore globs may shift scope).

Functions

func NormalizePath

func NormalizePath(path string) string

NormalizePath returns path with forward slashes and no leading `./`. Empty input passes through. Backslashes are translated even on platforms where filepath.ToSlash is a no-op so a Windows-style path landing in the index from a cross-platform test still keys against the same slot as the slashed form.
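The behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the package's source: `normalizePath` is a hypothetical stand-in, and whether the real function strips more than one leading `./` is not stated here.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePath is a hypothetical stand-in for index.NormalizePath,
// written only from the documented behaviour.
func normalizePath(p string) string {
	if p == "" {
		return p // empty input passes through
	}
	// Translate backslashes explicitly: filepath.ToSlash leaves them
	// alone on platforms whose separator is already '/'.
	p = strings.ReplaceAll(p, `\`, "/")
	// Drop a single leading "./" (assumption: only one is stripped).
	return strings.TrimPrefix(p, "./")
}

func main() {
	// Both spellings land on the same index key.
	fmt.Println(normalizePath(`.\docs\guide.md`))
	fmt.Println(normalizePath("./docs/guide.md"))
}
```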

func RefDefRegexpMatches

func RefDefRegexpMatches(body []byte) [][]int

RefDefRegexpMatches returns the same submatch indices refDefRE.FindAllSubmatchIndex produces for body. Exported so the LSP rename surface can iterate every reference definition without duplicating the regex pattern (and without giving callers a way to mutate the package-level pattern).
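As a rough illustration of how a caller might walk the returned index pairs, the sketch below uses `regexp.FindAllSubmatchIndex` directly. The pattern `refDefPattern` is an assumption made for this example; the package's real `refDefRE` is unexported and may capture different groups.

```go
package main

import (
	"fmt"
	"regexp"
)

// refDefPattern is an assumed pattern for `[label]: url` lines.
// It is NOT the package's refDefRE, which is not shown in these docs.
var refDefPattern = regexp.MustCompile(`(?m)^ {0,3}\[([^\]]+)\]:[ \t]*(\S+)`)

// refDefLabels returns the label of every definition found in body.
func refDefLabels(body []byte) []string {
	var labels []string
	for _, m := range refDefPattern.FindAllSubmatchIndex(body, -1) {
		// m[0:2] spans the whole match; m[2:4] spans capture group 1.
		labels = append(labels, string(body[m[2]:m[3]]))
	}
	return labels
}

func main() {
	body := []byte("See [the spec][spec].\n\n[spec]: https://example.com/spec\n[faq]: ./faq.md\n")
	fmt.Println(refDefLabels(body))
}
```

Working with index pairs rather than copied substrings lets a rename handler compute exact byte ranges to replace.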

func SafeURLPathEscape

func SafeURLPathEscape(s string) string

SafeURLPathEscape applies net/url.PathEscape to s. PathEscape percent-encodes every byte that would be unsafe in a URL path segment — spaces, slashes, `%`, and a long list of reserved / non-ASCII characters — so the result is guaranteed safe to drop into a `file://` URI fragment without further encoding. Despite the name, the function is not limited to "%" sequences; callers that need a more permissive encoding should reach for QueryEscape directly.
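Since the function is described as a thin wrapper over `net/url.PathEscape`, the encoding can be previewed with the standard library. `escapeSegment` is a hypothetical name used only for this sketch.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapeSegment is a hypothetical stand-in that mirrors the documented
// behaviour: it simply applies url.PathEscape.
func escapeSegment(s string) string {
	return url.PathEscape(s)
}

func main() {
	// Spaces, slashes, and literal percent signs are all percent-encoded.
	fmt.Println(escapeSegment("my notes/50% done.md"))
}
```

Note that the slash is encoded too, so a multi-segment path has to be escaped one segment at a time if the separators should survive.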

Types

type CompletionContext

type CompletionContext struct {
	Tag CompletionTag
	// Prefix is the partial text typed so far.
	Prefix string
	// TargetFile is the workspace-relative target file for CompletionAnchorOtherFile.
	TargetFile string
	// DirectiveName is the directive name for CompletionDirectivePath.
	DirectiveName string
	// DirectiveArg is the directive argument key for CompletionDirectivePath.
	DirectiveArg string
	// FrontMatterKey is "kind" or "kinds" for CompletionKindValue.
	FrontMatterKey string
}

CompletionContext is the result of Locator.CompletionContext.

type CompletionTag

type CompletionTag int

CompletionTag classifies the context in which a completion was triggered.

const (
	// CompletionNone means the cursor is not in a completable context.
	CompletionNone CompletionTag = iota
	// CompletionAnchorCurrentFile means the cursor is inside [text](#prefix.
	CompletionAnchorCurrentFile
	// CompletionAnchorOtherFile means the cursor is inside [text](path.md#prefix.
	CompletionAnchorOtherFile
	// CompletionRefLabel means the cursor is inside [text][prefix.
	CompletionRefLabel
	// CompletionKindValue means the cursor is on a front-matter kind:/kinds: value.
	CompletionKindValue
	// CompletionDirectivePath means the cursor is on a directive file/source/glob arg.
	CompletionDirectivePath
)

type Edge

type Edge struct {
	SourceFile   string
	SourceLine   int // 1-based
	SourceCol    int // 1-based
	TargetFile   string
	TargetAnchor string
	TargetLabel  string
	Kind         EdgeKind
	Unresolved   bool
}

Edge records one reference from a source position to a target.

Empty TargetFile means "same file as Source" (used for anchor and reference-style links). Empty TargetAnchor means the reference targets the file as a whole (e.g. `[text](./other.md)`).

Unresolved is set on edges whose target shape is a glob pattern (catalog directives) rather than a single file. Reverse-edge queries (IncomingEdges / BacklinksFor) skip unresolved edges so catalog directives don't surface as phantom self-backlinks the way empty-TargetFile placeholders did before plan 153.
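The matching rules above (empty TargetFile means the source file, Unresolved edges never match) can be expressed as a small predicate. This is a sketch under those stated rules; the `edge` type and `targets` function are local to the example and carry only the fields it needs.

```go
package main

import "fmt"

// edge mirrors only the Edge fields this sketch needs.
type edge struct {
	SourceFile   string
	TargetFile   string
	TargetAnchor string
	Unresolved   bool
}

// targets reports whether e points at (file, anchor).
func targets(e edge, file, anchor string) bool {
	if e.Unresolved {
		return false // glob-shaped target: no concrete file to match
	}
	target := e.TargetFile
	if target == "" {
		target = e.SourceFile // empty TargetFile means "same file as Source"
	}
	return target == file && e.TargetAnchor == anchor
}

func main() {
	sameFile := edge{SourceFile: "a.md", TargetAnchor: "intro"}
	catalog := edge{SourceFile: "a.md", Unresolved: true}
	fmt.Println(targets(sameFile, "a.md", "intro")) // same-file anchor link matches
	fmt.Println(targets(catalog, "a.md", ""))       // unresolved edge is skipped
}
```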

type EdgeKind

type EdgeKind int

EdgeKind enumerates the kinds of references the index tracks.

const (
	// EdgeAnchorLink is `[text](#anchor)` — same-file heading reference.
	EdgeAnchorLink EdgeKind = iota
	// EdgeFileLink is `[text](./other.md)` (with optional anchor).
	EdgeFileLink
	// EdgeRefLink is `[text][label]` — reference-style link use.
	EdgeRefLink
	// EdgeInclude is a `<?include file: …?>` directive.
	EdgeInclude
	// EdgeCatalog is a `<?catalog?>` directive.
	EdgeCatalog
	// EdgeBuild is a `<?build source: …?>` directive.
	EdgeBuild
)

type FileEntry

type FileEntry struct {
	// Path is the workspace-relative path with forward slashes.
	Path string
	// Symbols are this file's symbols, in document order.
	Symbols []Symbol
	// Outgoing are the references this file emits.
	Outgoing []Edge
	// Title is the front-matter `title:` value if set, "" otherwise.
	Title string
	// Kinds are the front-matter `kinds:` values if set.
	Kinds []string
	// LineCount is the number of source lines (1-based-inclusive
	// upper bound for symbol ranges). Used to bound heading ranges.
	LineCount int
}

FileEntry is one file's contribution to the index.

type Index

type Index struct {
	// contains filtered or unexported fields
}

Index is the workspace-wide symbol graph. Methods are safe to call concurrently with each other; concurrent Update/Remove on the same path is serialized internally.

func New

func New(root string) *Index

New returns an empty Index rooted at root. Build populates it.

func (*Index) AbsPathToWorkspace

func (i *Index) AbsPathToWorkspace(abs string) string

AbsPathToWorkspace returns the workspace-relative form of abs given the index's root directory. When abs is already relative, or when root is empty, the input is returned unchanged.
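A minimal sketch of the conversion, assuming it is built on `filepath.Rel`. `absToWorkspace` is a hypothetical helper; what the real method returns for paths outside root is not specified here, and this sketch simply returns whatever `filepath.Rel` yields.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// absToWorkspace is a hypothetical stand-in for Index.AbsPathToWorkspace.
func absToWorkspace(root, abs string) string {
	// Relative input or an empty root is returned unchanged.
	if root == "" || !filepath.IsAbs(abs) {
		return abs
	}
	rel, err := filepath.Rel(root, abs)
	if err != nil {
		return abs // assumption: fall back to the input on error
	}
	return filepath.ToSlash(rel)
}

func main() {
	// Paths below assume a Unix-style filesystem.
	fmt.Println(absToWorkspace("/work", "/work/docs/guide.md"))
	fmt.Println(absToWorkspace("/work", "docs/guide.md"))
}
```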

func (*Index) BacklinksFor

func (i *Index) BacklinksFor(file string) []Edge

BacklinksFor returns every workspace edge whose target is file, regardless of anchor. Use this for the "what cites this file?" question — IncomingEdges(file, anchor) answers the narrower "what targets this specific heading".

IncomingEdges already drops Unresolved edges (catalog directives whose glob pattern hasn't been expanded) so they don't surface here as phantom self-backlinks on every catalog host file.

Same-file citations (EdgeAnchorLink, EdgeRefLink) stay in the result so callers can filter on SourceFile when they want only external citations. The returned slice is freshly allocated and sorted by (SourceFile, SourceLine, SourceCol) so callers presenting the result to a user — or asserting on it in a test — see a stable order regardless of the underlying map iteration.
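The stable ordering described above amounts to a three-key comparison. The sketch below shows one way to write it with `sort.Slice`; the `edge` type is a local stand-in and the real implementation may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// edge carries only the source position fields used for ordering.
type edge struct {
	SourceFile string
	SourceLine int
	SourceCol  int
}

// sortEdges orders edges by (SourceFile, SourceLine, SourceCol).
func sortEdges(edges []edge) {
	sort.Slice(edges, func(a, b int) bool {
		x, y := edges[a], edges[b]
		if x.SourceFile != y.SourceFile {
			return x.SourceFile < y.SourceFile
		}
		if x.SourceLine != y.SourceLine {
			return x.SourceLine < y.SourceLine
		}
		return x.SourceCol < y.SourceCol
	})
}

func main() {
	edges := []edge{{"b.md", 3, 1}, {"a.md", 9, 4}, {"a.md", 9, 2}, {"a.md", 2, 7}}
	sortEdges(edges)
	fmt.Println(edges)
}
```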

func (*Index) Build

func (i *Index) Build(files []string, load func(path string) ([]byte, error))

Build indexes every Markdown file listed in files, reading each one through the supplied loader. The loader is called once per path; returning an error skips that file. files is the list of workspace-relative paths to index, typically produced by discovery.Discover and then made workspace-relative.

Build replaces the entire current index, including evicting any entries whose path no longer appears in files.

Build fans out the per-file extractor across runtime.GOMAXPROCS(0) worker goroutines so a multi-thousand-file workspace lands in the graph in roughly wall-clock / cpu-cores time. The extractor itself is pure given (path, bytes); the only shared state is the result map, which a single collector goroutine drains. The supplied loader is called concurrently — callers whose loader is not safe for concurrent calls must serialise inside it or fall back to BuildSerial.
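The fan-out shape described above (GOMAXPROCS workers, a concurrently called loader, one collector owning the result map) can be sketched as follows. This is an illustration of the pattern, not the package's code: `buildParallel`, `result`, and `mapLoader` are hypothetical, and the "extractor" here just records the byte length.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sync"
)

type result struct {
	path string
	size int
}

// mapLoader returns a loader backed by an in-memory map (demo only).
func mapLoader(docs map[string]string) func(string) ([]byte, error) {
	return func(p string) ([]byte, error) {
		src, ok := docs[p]
		if !ok {
			return nil, errors.New("not found: " + p)
		}
		return []byte(src), nil
	}
}

// buildParallel fans files out to workers and collects on the caller's goroutine.
func buildParallel(files []string, load func(string) ([]byte, error)) map[string]int {
	paths := make(chan string)
	results := make(chan result)
	var wg sync.WaitGroup
	for w := 0; w < runtime.GOMAXPROCS(0); w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range paths {
				src, err := load(p) // called concurrently across workers
				if err != nil {
					continue // a loader error skips the file
				}
				results <- result{path: p, size: len(src)}
			}
		}()
	}
	go func() {
		for _, p := range files {
			paths <- p
		}
		close(paths)
		wg.Wait()
		close(results)
	}()
	// Single collector: only this goroutine writes to the map.
	out := make(map[string]int, len(files))
	for r := range results {
		out[r.path] = r.size
	}
	return out
}

func main() {
	load := mapLoader(map[string]string{"a.md": "# A\n", "b.md": "# B\n\ntext\n"})
	sizes := buildParallel([]string{"a.md", "b.md", "missing.md"}, load)
	fmt.Println(len(sizes), sizes["a.md"], sizes["b.md"])
}
```

Keeping the map behind one collector avoids a mutex on the hot path; the cost is that the loader itself must tolerate concurrent calls.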

func (*Index) BuildSerial

func (i *Index) BuildSerial(files []string, load func(path string) ([]byte, error))

BuildSerial is the single-threaded variant of Build. Use this when the loader is not safe for concurrent calls.

func (*Index) File

func (i *Index) File(path string) (*FileEntry, bool)

File returns a snapshot of the FileEntry for the given workspace-relative path. The pointer is to a copy so callers may read the slices without holding the index lock; the slices themselves are shared, so callers must not mutate them.

func (*Index) Files

func (i *Index) Files() []string

Files returns a snapshot of the indexed file paths in arbitrary order. Callers must not retain the slice across mutations of the index.

func (*Index) FilesByKind

func (i *Index) FilesByKind(kind string) []string

FilesByKind returns workspace files whose front-matter `kinds:` list contains kind. Order is undefined.

func (*Index) IncomingEdges

func (i *Index) IncomingEdges(file, anchor string) []Edge

IncomingEdges returns every workspace edge whose target is the given (file, anchor). When anchor is "", it matches edges that target the file as a whole (no anchor specified by the caller).

Unresolved edges (catalog directives whose glob hasn't been expanded) are skipped — they don't yet point at a specific file, so they can't satisfy a (file, anchor) match. Treating their empty TargetFile as "same file" the way concrete same-file edges are treated would misattribute them as phantom self-backlinks (see plan 153 for the unification that introduced the flag).

The returned slice is a fresh copy.

func (*Index) OutgoingEdges

func (i *Index) OutgoingEdges(file string) []Edge

OutgoingEdges returns the edges originating in file.

func (*Index) Remove

func (i *Index) Remove(path string)

Remove drops the entry for path. No-op when absent.

func (*Index) Root

func (i *Index) Root() string

Root returns the workspace root the index was built against.

func (*Index) SearchSymbols

func (i *Index) SearchSymbols(query string, max int) []SymbolMatch

SearchSymbols returns symbols whose name (case-insensitive) contains query. Match scope:

  • heading text
  • link-ref labels
  • front-matter title (matched against the file's Title)
  • kind names from kinds:

Returns at most max entries (0 = unlimited).
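The matching rule (case-insensitive substring, capped at max with 0 meaning unlimited) might look like the sketch below. It searches a flat list of names only; `searchNames` is hypothetical and ignores the per-kind scoping the real method applies.

```go
package main

import (
	"fmt"
	"strings"
)

// searchNames returns the names containing query, ignoring case.
// max caps the result size; 0 means unlimited.
func searchNames(names []string, query string, max int) []string {
	q := strings.ToLower(query)
	var out []string
	for _, n := range names {
		if max > 0 && len(out) >= max {
			break
		}
		if strings.Contains(strings.ToLower(n), q) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	names := []string{"Getting Started", "Installation", "Advanced Settings", "FAQ"}
	fmt.Println(searchNames(names, "ST", 0)) // all matches
	fmt.Println(searchNames(names, "ST", 1)) // capped at one
}
```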

func (*Index) Update

func (i *Index) Update(path string, source []byte)

Update re-parses source under path and replaces the FileEntry. When source is empty the file is removed entirely (matches the case where the file was deleted from disk).

path must be workspace-relative. AbsPathToWorkspace is provided as a helper for callers that hold an absolute filesystem path.
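The "empty source removes the entry" rule is easy to miss, so here is a toy model of it. The `store` type is a hypothetical stand-in for the index that keeps only a byte count per path; it is not how the package stores entries.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a toy stand-in for the index: one entry per workspace-relative path.
type store struct {
	mu    sync.RWMutex
	files map[string]int // path -> byte length, standing in for a FileEntry
}

func newStore() *store { return &store{files: make(map[string]int)} }

// update replaces the entry for path; an empty source removes it.
func (s *store) update(path string, source []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(source) == 0 {
		delete(s.files, path) // same effect as the file being deleted
		return
	}
	s.files[path] = len(source)
}

// has reports whether path currently has an entry.
func (s *store) has(path string) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	_, ok := s.files[path]
	return ok
}

func main() {
	s := newStore()
	s.update("docs/guide.md", []byte("# Guide\n"))
	fmt.Println(s.has("docs/guide.md"))
	s.update("docs/guide.md", nil)
	fmt.Println(s.has("docs/guide.md"))
}
```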

func (*Index) UpdateWithKinds

func (i *Index) UpdateWithKinds(path string, source []byte, kinds []string)

UpdateWithKinds is Update plus an override for the file's effective kinds list. Callers pass the resolved (front-matter ∪ kind-assignment) list so workspace-symbol search and `kind:` navigation see config-driven assignments, not just front-matter declarations. When kinds is nil the result is identical to Update.
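Callers are expected to compute the union themselves before calling. One way to do that while preserving order and dropping duplicates is sketched below; `mergeKinds` is a hypothetical helper, not part of this package.

```go
package main

import "fmt"

// mergeKinds returns the order-preserving union of the two lists.
func mergeKinds(frontMatter, assigned []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, list := range [][]string{frontMatter, assigned} {
		for _, k := range list {
			if !seen[k] {
				seen[k] = true
				out = append(out, k)
			}
		}
	}
	return out
}

func main() {
	// Front-matter kinds come first, config-assigned kinds follow.
	fmt.Println(mergeKinds([]string{"guide", "api"}, []string{"api", "plan"}))
}
```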

type LocateResult

type LocateResult struct {
	Tag TokenTag

	// Heading: the slug.
	Anchor string
	// Heading: the level.
	Level int
	// Heading text (the displayed name).
	Name string

	// File link target (workspace-relative, from the cursor's host
	// file). Empty for anchor-only or ref-style links.
	TargetFile string
	// File link / anchor link: the target's heading anchor (slug).
	TargetAnchor string

	// Reference link / def label, normalized.
	Label string

	// Directive name when on a TokenDirectiveArg.
	DirectiveName string
	// Directive argument key (e.g. "file" or "source").
	DirectiveArg string
	// Directive argument value (raw, untrimmed).
	DirectiveValue string
	// DirectiveTargetFile is the raw `file:` (for include) or
	// `source:` (for build) value the cursor sits on, copied
	// verbatim from the directive body. It is *not* resolved
	// against the host file's directory — the LSP layer pipes it
	// through index.ResolveRelTarget, which applies the same
	// escape-the-root rejection rules as the edge collector.
	DirectiveTargetFile string

	// Front-matter key (when on TokenFrontMatterKey/Value).
	FrontMatterKey string
	// Front-matter value when on TokenFrontMatterValue.
	FrontMatterValue string
}

LocateResult is the payload of a successful Locate call.

type Locator

type Locator struct {
	Path string // workspace-relative path
}

Locator walks one parsed file's source and resolves a 1-based (line, character) position to a TokenTag plus the relevant payload.

Ranges are computed in mdsmith coordinates (1-based lines and columns); LSP-coordinate translation is the LSP layer's job. The returned LocateResult is meant to be self-contained: every field the caller needs to issue a follow-up Lookup is on the result, so the LSP handler does not need to re-parse.

func (Locator) CompletionContext

func (l Locator) CompletionContext(source []byte, line, col int) CompletionContext

CompletionContext determines the completion context at (line, col) in source. line and col are 1-based. Returns a CompletionContext describing the trigger context and the prefix typed so far.

The function operates on whatever bytes the caller hands in, so the LSP layer can call it on the live editor buffer without first landing the change in the index.
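To make the idea of a trigger context concrete, the sketch below detects only the same-file anchor case, `[text](#prefix`, on a single line. It is a deliberately simplified assumption about how such detection could work; `anchorPrefix` is hypothetical and the real method handles more contexts.

```go
package main

import (
	"fmt"
	"strings"
)

// anchorPrefix reports whether the cursor at 1-based byte column col sits
// inside an unfinished `[text](#prefix` on line, and returns the prefix.
func anchorPrefix(line string, col int) (string, bool) {
	if col < 1 || col-1 > len(line) {
		return "", false
	}
	before := line[:col-1] // text to the left of the cursor
	open := strings.LastIndex(before, "](#")
	if open < 0 {
		return "", false
	}
	prefix := before[open+3:]
	// A closing paren or whitespace means the link target already ended.
	if strings.ContainsAny(prefix, ") \t") {
		return "", false
	}
	return prefix, true
}

func main() {
	line := "See [intro](#get"
	fmt.Println(anchorPrefix(line, len(line)+1))
	fmt.Println(anchorPrefix("See [intro](#get) now", 22))
}
```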

func (Locator) Locate

func (l Locator) Locate(source []byte, line, col int) LocateResult

Locate returns the token tag at (line, col) in source. line and col are 1-based; col counts UTF-8 bytes (consistent with the rest of the index). When no specific tag fits, TokenNone is returned.

The function operates on whatever bytes the caller hands in, so the LSP layer can call it on the live editor buffer without first landing the change in the index.
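Because Locate takes 1-based lines and byte columns, an LSP handler has to translate the client's position first. The sketch below assumes the client uses the LSP default of 0-based lines and UTF-16 code units; `lspToIndexPos` is a hypothetical helper, not part of this package.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// lspToIndexPos converts a 0-based LSP position (UTF-16 code units) into a
// 1-based line and 1-based UTF-8 byte column. It returns (0, 0) when the
// line is out of range.
func lspToIndexPos(source []byte, lspLine, lspChar int) (line, col int) {
	lines := bytes.Split(source, []byte("\n"))
	if lspLine < 0 || lspLine >= len(lines) {
		return 0, 0
	}
	text := lines[lspLine]
	units, offset := 0, 0
	for offset < len(text) && units < lspChar {
		r, size := utf8.DecodeRune(text[offset:])
		if r >= 0x10000 {
			units += 2 // outside the BMP: a surrogate pair in UTF-16
		} else {
			units++
		}
		offset += size
	}
	return lspLine + 1, offset + 1
}

func main() {
	src := []byte("héllo\n😀x")
	fmt.Println(lspToIndexPos(src, 0, 2)) // after "hé": é is 2 bytes
	fmt.Println(lspToIndexPos(src, 1, 2)) // after the emoji: 4 bytes, 2 units
}
```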

type Symbol

type Symbol struct {
	// File is the workspace-relative path of the containing file
	// (forward slashes, no leading `./`). Index lookups key on this.
	File string
	// Kind is the symbol category.
	Kind SymbolKind
	// Name is the human-readable label (heading text, key, label,
	// directive name).
	Name string
	// Anchor is the normalized identifier used for cross-document
	// lookups: heading slug, link-ref label, or "" for other kinds.
	Anchor string
	// Level is the heading level (1–6) for SymbolHeading; 0 otherwise.
	Level int
	// StartLine, EndLine are 1-based line numbers covering the
	// symbol's full range. For headings the range extends to the
	// next sibling heading; for other kinds it's the source line.
	StartLine int
	EndLine   int
	// SelectionLine, SelectionCol point to the symbol's name/label
	// (1-based) — what an editor highlights when "go to definition"
	// jumps to it.
	SelectionLine int
	SelectionCol  int
}

Symbol is one entry in a file's outline.

type SymbolKind

type SymbolKind int

SymbolKind enumerates the four symbol shapes the index recognizes. Each maps to a specific LSP SymbolKind in the LSP layer; this package keeps the spec-level numbers out of its core types.

const (
	// SymbolHeading is a Markdown heading at any level (H1–H6). The
	// Anchor field carries the slug; the Level field carries the
	// heading level.
	SymbolHeading SymbolKind = iota
	// SymbolLinkRef is a `[label]: url` link-reference definition.
	// The Anchor field carries the normalized label.
	SymbolLinkRef
	// SymbolFrontMatter is a top-level YAML front-matter key. The
	// Name field carries the key.
	SymbolFrontMatter
	// SymbolDirective is a processing-instruction block (<?name … ?>).
	// The Name field carries the directive name.
	SymbolDirective
)

type SymbolMatch

type SymbolMatch struct {
	File   string
	Symbol Symbol
}

SymbolMatch pairs a Symbol with the file that contains it. Returned from workspace-wide queries so callers can build LSP locations.

type TokenTag

type TokenTag int

TokenTag classifies the shape of source token under the cursor. LSP definition / implementation / references handlers branch on this to decide what to resolve.

const (
	// TokenNone is the catch-all "cursor is on plain prose" tag.
	TokenNone TokenTag = iota
	// TokenHeading is the cursor on a heading line.
	TokenHeading
	// TokenAnchorLink is the cursor inside an `[…](#anchor)` link.
	TokenAnchorLink
	// TokenFileLink is the cursor inside an `[…](./other.md…)` link.
	TokenFileLink
	// TokenRefUse is the cursor inside an `[…][label]` reference-style link.
	TokenRefUse
	// TokenRefDef is the cursor inside a `[label]: url` definition.
	TokenRefDef
	// TokenDirectiveArg is the cursor on a directive argument value
	// (e.g. the `file:` value inside `<?include?>`).
	TokenDirectiveArg
	// TokenFrontMatterKey is the cursor on a top-level FM key.
	TokenFrontMatterKey
	// TokenFrontMatterValue is the cursor on a value beside a top-level
	// FM key — currently used only for `kind:` value lookups.
	TokenFrontMatterValue
	// TokenFileTop is the cursor at the first line of the body —
	// the line immediately after any stripped YAML front matter,
	// not necessarily line 1 of the source. The locator strips
	// `---\n…\n---\n` before tagging, so on a file with front
	// matter (line 1, col 1) lands inside the front-matter range
	// and surfaces as TokenFrontMatterKey instead.
	TokenFileTop
)
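The TokenFileTop note hinges on where the body starts once front matter is stripped. A minimal sketch of that calculation follows, assuming front matter is delimited by bare `---` lines with LF endings; `firstBodyLine` is hypothetical and the locator's real stripping rules may be stricter.

```go
package main

import (
	"fmt"
	"strings"
)

// firstBodyLine returns the 1-based line number where the body starts,
// that is, the line right after a leading front-matter block.
func firstBodyLine(source []byte) int {
	lines := strings.Split(string(source), "\n")
	if lines[0] != "---" {
		return 1 // no front matter: the body starts on line 1
	}
	for i := 1; i < len(lines); i++ {
		if lines[i] == "---" {
			return i + 2 // i is 0-based; the body is the line after it
		}
	}
	return 1 // unterminated block: treat the whole file as body
}

func main() {
	fmt.Println(firstBodyLine([]byte("---\ntitle: x\n---\n# H\n")))
	fmt.Println(firstBodyLine([]byte("# H\n")))
}
```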
