reachability

package
v3.8.1 Latest
Warning

This package is not in the latest version of its module.

Published: May 29, 2026 License: AGPL-3.0 Imports: 12 Imported by: 0

Documentation

Overview

Package reachability runs tree-sitter S-expression queries supplied by vdb-manager against source files to determine whether a known-vulnerable code pattern is present (direct mode) or reachable from first-party code (transitive mode).

Constants

const MaxFileSize = 4 * 1024 * 1024 // 4 MiB

MaxFileSize is the largest file the scanner will parse. Files above this threshold are silently skipped to keep memory bounded.
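The size guard described above can be sketched as a single comparison. The helper name shouldParse is hypothetical, and treating a file of exactly 4 MiB as still parseable is an assumption based on the wording "above this threshold".

```go
package main

import "fmt"

// MaxFileSize mirrors the documented constant: 4 MiB.
const MaxFileSize = 4 * 1024 * 1024

// shouldParse is a hypothetical helper (not part of the package API).
// It reports whether a file of the given size is small enough to parse.
// Assumption: a file of exactly MaxFileSize bytes is still parsed.
func shouldParse(size int64) bool {
	return size <= MaxFileSize
}

func main() {
	for _, size := range []int64{1024, MaxFileSize, MaxFileSize + 1} {
		fmt.Printf("%d bytes: parse=%v\n", size, shouldParse(size))
	}
}
```

Because oversized files are skipped silently, a caller that needs to audit coverage would have to compare file sizes against MaxFileSize itself.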

Variables

This section is empty.

Functions

func InstallPath

func InstallPath(projectRoot, ecosystem, pkg string) string

InstallPath attempts to locate the on-disk directory for a given (ecosystem, package) pair starting from projectRoot. The lookup is purely filesystem-based: no package manager is invoked. Returns "" if nothing plausible is found.

The lookup relies on each ecosystem's canonical install layout, so it may fail for monorepos or non-standard layouts. In that case the caller can still surface transitive matches.
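A filesystem-only lookup of this kind might look like the sketch below. The per-ecosystem layouts shown (node_modules for "npm", vendor for "go") and the names candidateDirs and installPathSketch are assumptions for illustration; the package's real layout table is not documented here.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidateDirs lists the directories this sketch would probe.
// The layouts are illustrative assumptions, not the package's real table.
func candidateDirs(projectRoot, ecosystem, pkg string) []string {
	switch ecosystem {
	case "npm":
		return []string{filepath.Join(projectRoot, "node_modules", pkg)}
	case "go":
		return []string{filepath.Join(projectRoot, "vendor", pkg)}
	default:
		return nil
	}
}

// installPathSketch returns the first candidate that exists as a directory,
// or "" when nothing plausible is found. No package manager is invoked.
func installPathSketch(projectRoot, ecosystem, pkg string) string {
	for _, dir := range candidateDirs(projectRoot, ecosystem, pkg) {
		if info, err := os.Stat(dir); err == nil && info.IsDir() {
			return dir
		}
	}
	return ""
}

func main() {
	if dir := installPathSketch(".", "npm", "lodash"); dir == "" {
		fmt.Println("install directory not found; only transitive matches are possible")
	} else {
		fmt.Println("found", dir)
	}
}
```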

Types

type CveSymbols added in v3.6.0

type CveSymbols struct {
	CveID    string
	Routines []string
	Files    []string
	Modules  []string
}

CveSymbols groups the three symbol lists for one CVE.

type Engine

type Engine struct {
	// contains filtered or unexported fields
}

Engine compiles and runs tree-sitter queries against source files. It is safe for concurrent use: parsers are never shared between concurrent calls; instead they are pooled per language.
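One common way to get "pooled per language, never shared" is a sync.Pool per language. The sketch below is an assumption about the general technique, not the Engine's actual internals; parser, parserPools, and withParser are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// parser is a hypothetical stand-in for a tree-sitter parser, which must not
// be used by two goroutines at the same time.
type parser struct{ lang string }

// parserPools keeps one sync.Pool per language.
type parserPools struct {
	mu    sync.Mutex
	pools map[string]*sync.Pool
}

func newParserPools() *parserPools {
	return &parserPools{pools: map[string]*sync.Pool{}}
}

// poolFor returns the pool for lang, creating it on first use.
func (p *parserPools) poolFor(lang string) *sync.Pool {
	p.mu.Lock()
	defer p.mu.Unlock()
	pool, ok := p.pools[lang]
	if !ok {
		pool = &sync.Pool{New: func() any { return &parser{lang: lang} }}
		p.pools[lang] = pool
	}
	return pool
}

// withParser borrows a parser, gives fn exclusive access to it, and then
// returns it to the pool for reuse.
func (p *parserPools) withParser(lang string, fn func(*parser)) {
	pool := p.poolFor(lang)
	ps := pool.Get().(*parser)
	defer pool.Put(ps)
	fn(ps)
}

func main() {
	pools := newParserPools()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			pools.withParser("go", func(ps *parser) {
				fmt.Println("worker", i, "uses a", ps.lang, "parser")
			})
		}(i)
	}
	wg.Wait()
}
```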

func NewEngine

func NewEngine() *Engine

NewEngine returns a fresh engine.

func (*Engine) Run

func (e *Engine) Run(ctx context.Context, id treesitter.LanguageID, source []byte, queryText string) ([]QueryMatch, error)

Run parses source as the given language and executes queryText against it, returning every top-level match.

type Match

type Match struct {
	File      string            `json:"file"`
	StartLine int               `json:"start_line"`
	EndLine   int               `json:"end_line"`
	Query     string            `json:"query,omitempty"`
	Language  string            `json:"language,omitempty"`
	Captures  map[string]string `json:"captures,omitempty"`
}

Match is one tree-sitter query hit recorded against a file.

func (Match) Range

func (m Match) Range() string

Range renders StartLine:EndLine using the "n:n" convention used in the rest of the CLI's output.
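As a minimal sketch of that rendering, assuming the "n:n" convention means the two line numbers joined by a colon (the type match and method rangeString are illustrative names):

```go
package main

import "fmt"

// match is a trimmed-down stand-in for reachability.Match.
type match struct {
	File      string
	StartLine int
	EndLine   int
}

// rangeString renders StartLine:EndLine, for example "12:15".
func (m match) rangeString() string {
	return fmt.Sprintf("%d:%d", m.StartLine, m.EndLine)
}

func main() {
	m := match{File: "src/app.js", StartLine: 12, EndLine: 15}
	fmt.Println(m.File, m.rangeString()) // prints "src/app.js 12:15"
}
```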

type Mode

type Mode string

Mode selects which scans are performed.

const (
	ModeOff        Mode = "off"
	ModeDirect     Mode = "direct"
	ModeTransitive Mode = "transitive"
	ModeBoth       Mode = "both"
)

func ParseMode

func ParseMode(s string) (Mode, bool)

ParseMode normalises user input. The empty string maps to ModeBoth (the default).
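The normalisation could be sketched as follows. Trimming whitespace, lower-casing, and returning an empty Mode for unknown input are assumptions; only the empty-string default to ModeBoth is stated above.

```go
package main

import (
	"fmt"
	"strings"
)

type Mode string

const (
	ModeOff        Mode = "off"
	ModeDirect     Mode = "direct"
	ModeTransitive Mode = "transitive"
	ModeBoth       Mode = "both"
)

// parseModeSketch is an illustrative reimplementation, not the package's code.
// Assumption: input is trimmed and lower-cased before comparison.
func parseModeSketch(s string) (Mode, bool) {
	switch m := Mode(strings.ToLower(strings.TrimSpace(s))); m {
	case "":
		return ModeBoth, true // documented default
	case ModeOff, ModeDirect, ModeTransitive, ModeBoth:
		return m, true
	default:
		return "", false
	}
}

func main() {
	for _, in := range []string{"", "Direct", "transitive", "everything"} {
		m, ok := parseModeSketch(in)
		fmt.Printf("%q -> %q ok=%v\n", in, m, ok)
	}
}
```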

func (Mode) Includes

func (m Mode) Includes(other Mode) bool

Includes reports whether the given mode is active under m.
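One plausible reading of Includes is sketched below: ModeBoth covers both scans, ModeOff covers nothing, and any other mode covers only itself. These semantics are an assumption inferred from the constant names.

```go
package main

import "fmt"

type Mode string

const (
	ModeOff        Mode = "off"
	ModeDirect     Mode = "direct"
	ModeTransitive Mode = "transitive"
	ModeBoth       Mode = "both"
)

// includesSketch is an illustrative reimplementation with assumed semantics.
func (m Mode) includesSketch(other Mode) bool {
	if m == ModeOff || other == ModeOff {
		return false // nothing is active when scanning is off
	}
	if m == ModeBoth {
		return other == ModeDirect || other == ModeTransitive || other == ModeBoth
	}
	return m == other
}

func main() {
	fmt.Println(ModeBoth.includesSketch(ModeDirect))       // true
	fmt.Println(ModeDirect.includesSketch(ModeTransitive)) // false
	fmt.Println(ModeOff.includesSketch(ModeDirect))        // false
}
```

A scanner can then gate each pass with a single check, for example running the direct pass only when the requested mode includes ModeDirect.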

type QueryMatch

type QueryMatch struct {
	StartLine int
	EndLine   int
	Captures  map[string]string
}

QueryMatch is a single match emitted by Engine.Run before being promoted to a reachability.Match (which adds file context).
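The promotion step can be pictured as copying the match and attaching file context. The function promote is a hypothetical helper; the struct fields and JSON tags are copied from the declarations on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type QueryMatch struct {
	StartLine int
	EndLine   int
	Captures  map[string]string
}

type Match struct {
	File      string            `json:"file"`
	StartLine int               `json:"start_line"`
	EndLine   int               `json:"end_line"`
	Query     string            `json:"query,omitempty"`
	Language  string            `json:"language,omitempty"`
	Captures  map[string]string `json:"captures,omitempty"`
}

// promote is a hypothetical helper that adds file context to a QueryMatch.
func promote(file, language, query string, qm QueryMatch) Match {
	return Match{
		File:      file,
		StartLine: qm.StartLine,
		EndLine:   qm.EndLine,
		Query:     query,
		Language:  language,
		Captures:  qm.Captures,
	}
}

func main() {
	qm := QueryMatch{StartLine: 3, EndLine: 4, Captures: map[string]string{"fn": "yaml.load"}}
	out, err := json.Marshal(promote("src/app.py", "python", "(call) @fn", qm))
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```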

type Result

type Result struct {
	Direct     []Match `json:"direct,omitempty"`
	Transitive []Match `json:"transitive,omitempty"`
	// SkippedDirect and SkippedTransitive hold the reason a requested mode
	// couldn't run, e.g. the package install folder couldn't be located.
	SkippedDirect     string `json:"skipped_direct,omitempty"`
	SkippedTransitive string `json:"skipped_transitive,omitempty"`
	// QueriesRun is the count of distinct query/language pairs executed.
	QueriesRun int `json:"queries_run"`
}

Result is the full reachability output for a single vulnerability.

func Scan

func Scan(ctx context.Context, engine *Engine, req ScanRequest) (*Result, error)

Scan runs the queries against the project. Direct matches come from files inside the installed-package directory; transitive matches come from the rest of the project tree (excluding the install directory and standard build/cache folders).
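The direct/transitive split described above amounts to classifying each walked path. The sketch below assumes a hypothetical skip list of build/cache folder names; the package's real list is not documented here, and classify and isUnder are illustrative names.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// skipDirs is an assumed list of build/cache folders, for illustration only.
var skipDirs = map[string]bool{
	".git": true, "node_modules": true, "dist": true, "build": true, "target": true,
}

// isUnder reports whether path is dir itself or lies inside it.
func isUnder(dir, path string) bool {
	rel, err := filepath.Rel(dir, path)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// classify returns "direct" for files inside the install directory,
// "transitive" for other project files, and "skip" for everything else.
func classify(projectRoot, installDir, path string) string {
	if installDir != "" && isUnder(installDir, path) {
		return "direct"
	}
	if !isUnder(projectRoot, path) {
		return "skip"
	}
	rel, _ := filepath.Rel(projectRoot, path) // cannot fail: isUnder succeeded
	for _, seg := range strings.Split(rel, string(filepath.Separator)) {
		if skipDirs[seg] {
			return "skip"
		}
	}
	return "transitive"
}

func main() {
	root := filepath.Join("/", "proj")
	install := filepath.Join(root, "node_modules", "lodash")
	for _, p := range []string{
		filepath.Join(install, "index.js"),
		filepath.Join(root, "src", "app.js"),
		filepath.Join(root, "node_modules", "other", "x.js"),
	} {
		fmt.Println(p, "->", classify(root, install, p))
	}
}
```

Checking the install directory first matters in this sketch: the install directory usually sits inside a folder that the transitive pass would otherwise skip.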

func (*Result) Empty

func (r *Result) Empty() bool

Empty reports whether no matches were recorded.
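A minimal sketch of such a check, assuming "no matches" means both the Direct and Transitive slices are empty (the types result and match are trimmed-down stand-ins):

```go
package main

import "fmt"

// match and result are trimmed-down stand-ins for Match and Result.
type match struct{ File string }

type result struct {
	Direct        []match
	Transitive    []match
	SkippedDirect string
}

// empty reports whether neither scan recorded a match.
// Assumption: skip reasons do not affect the outcome.
func (r *result) empty() bool {
	return len(r.Direct) == 0 && len(r.Transitive) == 0
}

func main() {
	r := &result{SkippedDirect: "install folder not found"}
	fmt.Println(r.empty()) // true: a skipped scan still yields no matches
	r.Transitive = append(r.Transitive, match{File: "src/app.js"})
	fmt.Println(r.empty()) // false
}
```

Under this reading, an empty result is not proof of safety: a caller would also want to inspect SkippedDirect and SkippedTransitive before concluding that nothing is reachable.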

type ScanRequest

type ScanRequest struct {
	ProjectRoot string
	Ecosystem   string
	Package     string
	// Queries from vdb-api's GET /vuln/{id}/tree-sitter response.
	Queries []vdb.TreeSitterQuery
	Mode    Mode
}

ScanRequest groups the inputs to one reachability scan.

type SymbolMatch added in v3.6.0

type SymbolMatch struct {
	File   string
	Line   int    // 1-indexed source line of the match (0 for file-name hits)
	Symbol string // the routine/file/module name that matched
	Kind   string // "routine" | "file" | "module"
}

SymbolMatch is one grep hit found during the fallback pass. The Reachability label it powers is intentionally named "semantic"; see the docs site for the full meaning. In short, it is the "your code imports/references the affected element by name" signal: lower efficacy than a tree-sitter AST match, but a strong indicator that the dependency is actually used rather than being a phantom go.sum entry.

type SymbolMatchRequest added in v3.6.0

type SymbolMatchRequest struct {
	ProjectRoot string
	Inputs      []CveSymbols
}

SymbolMatchRequest is the input to MatchAffectedSymbols. Inputs is a flat slice, one entry per CVE, carrying the symbol lists the server returned. The CLI receives a reverse map cve→[hits] so it can stamp Reachability="grep-match" per CVE.

type SymbolMatchResult added in v3.6.0

type SymbolMatchResult struct {
	HitsByCVE map[string][]SymbolMatch
}

SymbolMatchResult is the cve→hits output.
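Building a cve→hits map from a single scan implies fanning each hit back out to every CVE that listed the symbol. The sketch below shows one way to do that with a reverse index; groupByCVE and the lowercase types are hypothetical, and the real implementation may differ.

```go
package main

import "fmt"

type cveSymbols struct {
	CveID    string
	Routines []string
	Files    []string
	Modules  []string
}

type symbolMatch struct {
	File   string
	Line   int
	Symbol string
	Kind   string // "routine" | "file" | "module"
}

type symbolKey struct{ kind, symbol string }

// groupByCVE maps each hit back to every CVE whose lists contain its symbol.
func groupByCVE(inputs []cveSymbols, hits []symbolMatch) map[string][]symbolMatch {
	// Reverse index: (kind, symbol) -> CVE IDs that listed it.
	index := map[symbolKey][]string{}
	add := func(kind, id string, names []string) {
		for _, n := range names {
			k := symbolKey{kind, n}
			index[k] = append(index[k], id)
		}
	}
	for _, in := range inputs {
		add("routine", in.CveID, in.Routines)
		add("file", in.CveID, in.Files)
		add("module", in.CveID, in.Modules)
	}
	out := map[string][]symbolMatch{}
	for _, h := range hits {
		for _, id := range index[symbolKey{h.Kind, h.Symbol}] {
			out[id] = append(out[id], h)
		}
	}
	return out
}

func main() {
	inputs := []cveSymbols{
		{CveID: "CVE-A", Routines: []string{"yaml.load"}},
		{CveID: "CVE-B", Routines: []string{"yaml.load"}, Modules: []string{"PyYAML"}},
	}
	hits := []symbolMatch{{File: "app.py", Line: 7, Symbol: "yaml.load", Kind: "routine"}}
	for id, hs := range groupByCVE(inputs, hits) {
		fmt.Println(id, len(hs))
	}
}
```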

func MatchAffectedSymbols added in v3.6.0

func MatchAffectedSymbols(ctx context.Context, req SymbolMatchRequest) (*SymbolMatchResult, error)

MatchAffectedSymbols walks ProjectRoot once and tests every text file against a single compiled OR-regex built from every quality-filtered routine and module name across all CVEs. File-name hits use simple Base() suffix matching on the walked path itself, so they need no file-content scan. The match is purely literal: there is no language awareness beyond an extension allowlist that mirrors the tree-sitter scanner's view of "source files we should look at".

Quality threshold for routine + module names:

  • length >= 5 AND
  • contains at least one capital letter, dot, or underscore

Short or all-lowercase English words like "open" or "parse" would false-positive almost everywhere; the threshold removes them up front. File-name entries are matched verbatim regardless of length, since they are already path-shaped.
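The quality threshold and the single OR-regex can be sketched together as follows. Measuring length in bytes, using unicode.IsUpper for "capital letter", and the names passesQuality and buildPattern are assumptions; escaping each name with regexp.QuoteMeta keeps the match literal.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
)

// passesQuality applies the documented threshold: length >= 5 AND at least
// one capital letter, dot, or underscore. Length is counted in bytes here.
func passesQuality(name string) bool {
	if len(name) < 5 {
		return false
	}
	return strings.ContainsAny(name, "._") || strings.IndexFunc(name, unicode.IsUpper) >= 0
}

// buildPattern compiles one alternation of all names that pass the filter.
// It returns nil when no name qualifies.
func buildPattern(names []string) *regexp.Regexp {
	var quoted []string
	for _, n := range names {
		if passesQuality(n) {
			quoted = append(quoted, regexp.QuoteMeta(n)) // literal match only
		}
	}
	if len(quoted) == 0 {
		return nil
	}
	return regexp.MustCompile(strings.Join(quoted, "|"))
}

func main() {
	re := buildPattern([]string{"open", "parse", "yaml.load", "ParseRequest"})
	for _, line := range []string{"data = yaml.load(f)", "f = open(path)"} {
		fmt.Printf("%q matches=%v\n", line, re.MatchString(line))
	}
}
```

Compiling one alternation means each file is scanned once no matter how many CVEs are in the request, which fits the single-walk design described above.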
