graphfrag

package
v0.6.0
Published: Jun 1, 2026 | License: GPL-2.0, GPL-2.0-only | Imports: 4 | Imported by: 0

Documentation

Overview

Package graphfrag is crypto-finder's public contract for reusable component graph fragments. A fragment is the structural call graph plus the rules-versioned crypto annotations that `crypto-finder scan --export-graph-fragment` emits for a single component. The package also provides the pure stitcher that composes a dependency closure of those fragments into root-to-crypto reachability chains.

Why this lives in crypto-finder (not a downstream service): the graph fragment schema and the resolution-quality semantics are crypto-finder's public contract. The scanner produces them, so the rules for consuming them (which edges may extend a chain) belong with the contract owner. A reimplementation in a downstream catalog or mining service would drift as soon as the schema version changes. This mirrors the rationale of pkg/stitch.

The package is intentionally dependency-light: it does not import the scanner or callgraph builder, read from storage, or handle gzip. Inputs are raw graph-fragment exports (or already-decoded Fragments); the caller fetches and decompresses them.
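Because decompression is left to the caller, a consumer that stores fragments gzip-compressed needs a small inflate step before decoding. The following is a minimal sketch using only the standard library; `gunzip` and `gzipBytes` are hypothetical helper names, not part of this package, and the sample JSON is a placeholder rather than a complete export.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gunzip is a hypothetical caller-side helper: it inflates a gzip-compressed
// fragment blob so the raw JSON can be handed to DecodeFragment.
func gunzip(blob []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(blob))
	if err != nil {
		return nil, fmt.Errorf("open gzip stream: %w", err)
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// gzipBytes compresses data; it exists here only to build a sample blob.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Placeholder payload standing in for a stored fragment.
	blob, err := gzipBytes([]byte(`{"schema_version":"graph-fragment-1.1"}`))
	if err != nil {
		panic(err)
	}
	raw, err := gunzip(blob)
	if err != nil {
		panic(err)
	}
	// raw is the byte slice a caller would pass on for decoding.
	fmt.Println(string(raw))
}
```

Keeping this step outside the package means the stitcher never needs to know how or where fragments are stored.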

Constants

const (
	// SuppressReasonUnknown: an edge had no resolution kind (producer bug or an
	// intentionally untrusted edge).
	SuppressReasonUnknown = "unknown_resolution"
	// SuppressReasonNameOnly: a name+arity guess with no receiver type anchor.
	SuppressReasonNameOnly = "name_only"
	// SuppressReasonAmbiguousDispatch: an interface call site with more than one
	// concrete implementation present in the current component's direct dependencies.
	SuppressReasonAmbiguousDispatch = "interface_dispatch_ambiguous"
)

Suppression reasons recorded on a SuppressedEdge.

const ConfidenceHigh = "high"

ConfidenceHigh is the confidence of every chain emitted under the default fail-closed policy (only exact and unique-implementation interface edges are traversed). The constant exists so a future opt-in mode can surface lower-confidence chains explicitly.

const SchemaVersion = "graph-fragment-1.1"

SchemaVersion is the current graph-fragment export schema version.

1.1 added per-edge resolution metadata (resolution / declared_type / method_name / arity) on internal_edges and external_calls. The fields are additive: a 1.0 fragment decodes with an empty resolution, which the stitcher treats as untrusted (fail-closed).
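The additive behaviour can be seen with plain `encoding/json`. This sketch uses a local copy of the GraphFragmentEdge wire shape so that it compiles on its own; the function keys (`a.F`, `a.G`) are made-up sample values, and `decodeEdge` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// edge is a local copy of the documented GraphFragmentEdge wire shape,
// duplicated so the sketch compiles without importing the package.
type edge struct {
	CallerKey    string `json:"caller_key"`
	CalleeKey    string `json:"callee_key"`
	Line         int    `json:"line,omitempty"`
	Resolution   string `json:"resolution"`
	DeclaredType string `json:"declared_type,omitempty"`
	MethodName   string `json:"method_name,omitempty"`
	Arity        int    `json:"arity,omitempty"`
}

// decodeEdge parses one internal edge from its JSON form.
func decodeEdge(data []byte) (edge, error) {
	var e edge
	err := json.Unmarshal(data, &e)
	return e, err
}

func main() {
	// A 1.0-era edge carries no resolution metadata at all.
	legacy, err := decodeEdge([]byte(`{"caller_key":"a.F","callee_key":"a.G","line":12}`))
	if err != nil {
		panic(err)
	}
	// A 1.1 edge states how the callee was resolved.
	current, err := decodeEdge([]byte(`{"caller_key":"a.F","callee_key":"a.G","line":12,"resolution":"exact"}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("legacy resolution=%q\n", legacy.Resolution)
	fmt.Printf("current resolution=%q\n", current.Resolution)
}
```

The missing field simply leaves the Go zero value in place, which is why an old fragment lands on the untrusted, never-traversed case without any version-specific branching.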

Variables

This section is empty.

Functions

This section is empty.

Types

type CallFrame

type CallFrame struct {
	Component ComponentKey
	Function  string
}

CallFrame is one frame in a stitched path.

type ComponentKey

type ComponentKey struct {
	Purl    string
	Version string
}

ComponentKey identifies one mined component version.

func (ComponentKey) String

func (k ComponentKey) String() string

type CryptoOperation

type CryptoOperation struct {
	Function  string
	FindingID string
	RuleID    string
	Symbol    string
}

CryptoOperation is a crypto finding attached to a function.

type DependencyGraph

type DependencyGraph map[ComponentKey][]ComponentKey

DependencyGraph is the authoritative component-version graph resolved from build metadata. Stitching only crosses into components reachable through this graph, even if extra fragments are available in storage.
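To illustrate what "reachable through this graph" means, here is a rough sketch of computing the dependency closure of a root with a breadth-first walk. The types are local mirrors of the documented ones, `closure` is a hypothetical helper rather than part of the package, and the purls are invented sample values.

```go
package main

import "fmt"

// Local mirrors of the documented types, so the sketch is self-contained.
type ComponentKey struct {
	Purl    string
	Version string
}

type DependencyGraph map[ComponentKey][]ComponentKey

// closure returns every component reachable from root through deps, in
// breadth-first order, root included. Hypothetical helper for illustration.
func closure(root ComponentKey, deps DependencyGraph) []ComponentKey {
	seen := map[ComponentKey]bool{root: true}
	order := []ComponentKey{root}
	for i := 0; i < len(order); i++ {
		for _, dep := range deps[order[i]] {
			if !seen[dep] {
				seen[dep] = true
				order = append(order, dep)
			}
		}
	}
	return order
}

func main() {
	app := ComponentKey{Purl: "pkg:golang/example.com/app", Version: "v1.0.0"}
	lib := ComponentKey{Purl: "pkg:golang/example.com/lib", Version: "v0.3.1"}
	tls := ComponentKey{Purl: "pkg:golang/example.com/tls", Version: "v2.0.0"}
	stray := ComponentKey{Purl: "pkg:golang/example.com/stray", Version: "v9.9.9"}

	deps := DependencyGraph{
		app:   {lib},
		lib:   {tls},
		stray: {tls}, // known to storage, but not reachable from app
	}
	for _, k := range closure(app, deps) {
		fmt.Println(k.Purl, k.Version)
	}
}
```

In this sample, `stray` never appears in the closure of `app`, so a fragment for it would be ignored even if one were available.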

type ErrMissingFragment

type ErrMissingFragment struct {
	Components []ComponentKey
}

ErrMissingFragment means the dependency closure references components whose graph fragments are absent. The stitcher fails closed instead of returning a partial graph.

func (*ErrMissingFragment) Error

func (e *ErrMissingFragment) Error() string

type ExternalCall

type ExternalCall struct {
	Caller          string
	TargetSignature string

	// Resolution classifies how the producer resolved TargetSignature. The zero
	// value (ResolutionUnknown) is fail-closed: the stitcher will not traverse it.
	Resolution ResolutionKind

	// DeclaredType is the static/interface type observed at the call site (e.g.
	// the interface whose method was dispatched). Provenance plus part of the
	// dispatch-group identity used to detect ambiguous interface dispatch.
	DeclaredType string

	// MethodName and Arity identify the invoked method independently of the
	// resolved target, so sibling candidates of one ambiguous call site can be
	// grouped together.
	MethodName string
	Arity      int

	// CallSite is the source line of the call expression. Together with Caller,
	// MethodName, and Arity it discriminates distinct call sites that happen to
	// share a method name within the same caller.
	CallSite int
}

ExternalCall is a call from this component to a function whose implementation may live in another component from the dependency graph.

type FindingChain

type FindingChain struct {
	FindingID string
	RuleID    string
	Symbol    string
	Frames    []CallFrame

	// Confidence is the weakest-link confidence of the traversed edges. Under
	// the default policy this is always ConfidenceHigh.
	Confidence string
}

FindingChain is one root-to-crypto path.

type Fragment

type Fragment struct {
	Component ComponentKey
	Module    string

	Functions        []Function
	InternalEdges    []InternalEdge
	ExternalCalls    []ExternalCall
	CryptoOperations []CryptoOperation
}

Fragment is one reusable structural graph fragment plus rules-versioned crypto annotations for a single component version. The production storage layer may split this into structural graph blobs and separate crypto annotation blobs, but the stitcher consumes the combined view.

func DecodeFragment

func DecodeFragment(component ComponentKey, data []byte) (Fragment, error)

DecodeFragment parses one graph-fragment export (JSON) into a Fragment for the given component. Legacy fragments exported before the resolution fields existed decode to ResolutionUnknown, on which the stitcher fails closed: the result may under-report, but it never yields a false positive.

type Function

type Function struct {
	Signature string
	FilePath  string
}

Function identifies one callable node inside a component graph.

type GraphFragmentCryptoOp

type GraphFragmentCryptoOp struct {
	FunctionKey string `json:"function_key,omitempty"`
	FindingID   string `json:"finding_id,omitempty"`
	RuleID      string `json:"rule_id,omitempty"`
	Symbol      string `json:"symbol,omitempty"`
	Expression  string `json:"expression,omitempty"`
	FilePath    string `json:"file_path,omitempty"`
	StartLine   int    `json:"start_line,omitempty"`
	EndLine     int    `json:"end_line,omitempty"`
}

GraphFragmentCryptoOp is one crypto finding annotation attached to a function in the exported graph fragment.

type GraphFragmentEdge

type GraphFragmentEdge struct {
	CallerKey    string `json:"caller_key"`
	CalleeKey    string `json:"callee_key"`
	Line         int    `json:"line,omitempty"`
	Resolution   string `json:"resolution"`
	DeclaredType string `json:"declared_type,omitempty"`
	MethodName   string `json:"method_name,omitempty"`
	Arity        int    `json:"arity,omitempty"`
}

GraphFragmentEdge is one internal (intra-component) call edge plus the resolution metadata that lets a consumer decide whether to traverse it.

type GraphFragmentExport

type GraphFragmentExport struct {
	SchemaVersion     string                    `json:"schema_version"`
	ScanMetadata      GraphFragmentScanMetadata `json:"scan_metadata"`
	Functions         []GraphFragmentFunction   `json:"functions"`
	InternalEdges     []GraphFragmentEdge       `json:"internal_edges,omitempty"`
	ExternalCalls     []GraphFragmentExternal   `json:"external_calls,omitempty"`
	CryptoAnnotations []GraphFragmentCryptoOp   `json:"crypto_annotations,omitempty"`
}

GraphFragmentExport is the on-the-wire JSON shape emitted by `crypto-finder scan --export-graph-fragment` for a single component. It is crypto-finder's public contract; the scanner (internal/scan) builds it from a callgraph, and any consumer decodes it into a Fragment via DecodeFragment.

func (GraphFragmentExport) ToFragment

func (e GraphFragmentExport) ToFragment(component ComponentKey) Fragment

ToFragment projects an exported graph fragment onto the stitch model for the given component. The component key is supplied by the caller because the export carries source-level identity (module, function keys) but not the (purl, version) it was requested for.

type GraphFragmentExternal

type GraphFragmentExternal struct {
	CallerKey          string `json:"caller_key"`
	TargetKey          string `json:"target_key"`
	TargetFunctionName string `json:"target_function_name,omitempty"`
	Raw                string `json:"raw,omitempty"`
	Line               int    `json:"line,omitempty"`
	Resolution         string `json:"resolution"`
	DeclaredType       string `json:"declared_type,omitempty"`
	MethodName         string `json:"method_name,omitempty"`
	Arity              int    `json:"arity,omitempty"`
}

GraphFragmentExternal is one external (cross-component) call edge plus its resolution metadata.

type GraphFragmentFunction

type GraphFragmentFunction struct {
	Key                string          `json:"key"`
	FunctionName       string          `json:"function_name"`
	CanonicalSignature string          `json:"canonical_signature,omitempty"`
	Package            string          `json:"package,omitempty"`
	Type               string          `json:"type,omitempty"`
	Name               string          `json:"name,omitempty"`
	FilePath           string          `json:"file_path,omitempty"`
	StartLine          int             `json:"start_line,omitempty"`
	EndLine            int             `json:"end_line,omitempty"`
	ReturnType         string          `json:"return_type,omitempty"`
	ParameterTypes     []string        `json:"parameter_types,omitempty"`
	Visibility         string          `json:"visibility,omitempty"`
	OwnerVisibility    string          `json:"owner_visibility,omitempty"`
	InferredReturn     json.RawMessage `json:"-"`
}

GraphFragmentFunction is one function declaration included in a component's graph-fragment export.

type GraphFragmentScanMetadata

type GraphFragmentScanMetadata struct {
	Ecosystem     string `json:"ecosystem,omitempty"`
	RootModule    string `json:"root_module,omitempty"`
	ToolName      string `json:"tool_name,omitempty"`
	ToolVersion   string `json:"tool_version,omitempty"`
	RulesVersion  string `json:"rules_version,omitempty"`
	ExportedAt    string `json:"exported_at"`
	FunctionCount int    `json:"function_count"`
	InternalEdges int    `json:"internal_edge_count"`
	ExternalCalls int    `json:"external_call_count"`
	CryptoOps     int    `json:"crypto_operation_count"`
}

GraphFragmentScanMetadata summarizes the scan that produced a graph-fragment export and the payload counts emitted for that component.
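One possible use of the payload counts is a cheap integrity check before decoding a fragment in full. The sketch below assumes that each count is meant to equal the length of the corresponding array in the export; that reading is an assumption, not something the documentation states. `checkCounts` and the trimmed structs are hypothetical, and the sample keys are invented.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed local mirrors of the documented wire types; only the fields this
// check needs are declared, and array elements stay undecoded.
type scanMetadata struct {
	FunctionCount int `json:"function_count"`
	InternalEdges int `json:"internal_edge_count"`
	ExternalCalls int `json:"external_call_count"`
	CryptoOps     int `json:"crypto_operation_count"`
}

type export struct {
	ScanMetadata      scanMetadata      `json:"scan_metadata"`
	Functions         []json.RawMessage `json:"functions"`
	InternalEdges     []json.RawMessage `json:"internal_edges,omitempty"`
	ExternalCalls     []json.RawMessage `json:"external_calls,omitempty"`
	CryptoAnnotations []json.RawMessage `json:"crypto_annotations,omitempty"`
}

// checkCounts reports the first declared count that disagrees with the
// payload. ASSUMPTION: each count is meant to equal its array's length.
func checkCounts(data []byte) error {
	var e export
	if err := json.Unmarshal(data, &e); err != nil {
		return fmt.Errorf("decode export: %w", err)
	}
	checks := []struct {
		name      string
		want, got int
	}{
		{"functions", e.ScanMetadata.FunctionCount, len(e.Functions)},
		{"internal_edges", e.ScanMetadata.InternalEdges, len(e.InternalEdges)},
		{"external_calls", e.ScanMetadata.ExternalCalls, len(e.ExternalCalls)},
		{"crypto_annotations", e.ScanMetadata.CryptoOps, len(e.CryptoAnnotations)},
	}
	for _, c := range checks {
		if c.want != c.got {
			return fmt.Errorf("%s: metadata says %d, payload has %d", c.name, c.want, c.got)
		}
	}
	return nil
}

func main() {
	doc := []byte(`{
	  "scan_metadata": {"function_count": 2, "internal_edge_count": 1,
	                    "external_call_count": 0, "crypto_operation_count": 0},
	  "functions": [{"key": "a.F", "function_name": "F"}, {"key": "a.G", "function_name": "G"}],
	  "internal_edges": [{"caller_key": "a.F", "callee_key": "a.G", "resolution": "exact"}]
	}`)
	if err := checkCounts(doc); err != nil {
		fmt.Println("count mismatch:", err)
		return
	}
	fmt.Println("counts agree with payload")
}
```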

type InternalEdge

type InternalEdge struct {
	Caller string
	Callee string

	// Resolution classifies how the producer resolved Callee. Zero value
	// (ResolutionUnknown) is fail-closed.
	Resolution ResolutionKind

	// DeclaredType, MethodName, Arity, CallSite mirror ExternalCall and identify
	// the dispatched method / call site for ambiguity grouping.
	DeclaredType string
	MethodName   string
	Arity        int
	CallSite     int
}

InternalEdge connects two functions inside the same component fragment.

Internal edges carry the same resolution metadata as ExternalCall: an interface or fluent call resolved by name+arity can land on a co-located implementation just as easily as a cross-component one, so it must be gated by the same policy. The dispatch-group identity (Caller + CallSite + MethodName + Arity) is shared with ExternalCall so that siblings of one call site that span the component boundary are judged together.
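The grouping described above can be sketched as a map keyed by the four identity fields. This is only an illustration: `callEdge` is a trimmed, hypothetical stand-in for InternalEdge and ExternalCall (with a single `Target` field replacing Callee and TargetSignature), and the signatures are sample values.

```go
package main

import "fmt"

// callEdge is a hypothetical, trimmed stand-in for both edge kinds: it keeps
// only the grouping fields plus the resolved target.
type callEdge struct {
	Caller     string
	CallSite   int
	MethodName string
	Arity      int
	Target     string
}

// groupKey is the dispatch-group identity: Caller + CallSite + MethodName + Arity.
type groupKey struct {
	Caller     string
	CallSite   int
	MethodName string
	Arity      int
}

// groupByCallSite collects the candidate targets of each call site, whether
// they are internal or cross-component.
func groupByCallSite(edges []callEdge) map[groupKey][]string {
	groups := make(map[groupKey][]string)
	for _, e := range edges {
		k := groupKey{Caller: e.Caller, CallSite: e.CallSite, MethodName: e.MethodName, Arity: e.Arity}
		groups[k] = append(groups[k], e.Target)
	}
	return groups
}

func main() {
	edges := []callEdge{
		// Two candidates for the same call site: one local, one in a dependency.
		{Caller: "app.Run", CallSite: 40, MethodName: "Sign", Arity: 1, Target: "app.localSigner.Sign"},
		{Caller: "app.Run", CallSite: 40, MethodName: "Sign", Arity: 1, Target: "lib.RSASigner.Sign"},
		// Same method name, different line: a separate call site.
		{Caller: "app.Run", CallSite: 57, MethodName: "Sign", Arity: 1, Target: "lib.RSASigner.Sign"},
	}
	for k, targets := range groupByCallSite(edges) {
		fmt.Printf("%s:%d %s/%d -> %d candidate(s)\n", k.Caller, k.CallSite, k.MethodName, k.Arity, len(targets))
	}
}
```

Grouping internal and external candidates under one key is what lets a call site with one local and one remote implementation be recognised as ambiguous rather than as two independent unique edges.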

type ResolutionKind

type ResolutionKind string

ResolutionKind classifies how confidently the producer resolved a call to its target. The stitcher uses this to decide whether an edge is allowed to extend a reachability chain. The zero value is ResolutionUnknown, which is treated as untrusted: the stitcher fails closed rather than guessing.

This is the central guard against over-broad dispatch false positives. An interface or fluent call resolved purely by method name + arity (no receiver type anchor) must never be presented as typed reachability proof.

const (
	// ResolutionUnknown is the zero value: the producer did not classify the
	// edge. Treated as untrusted and never traversed. Its presence usually means
	// a producer bug (an edge exported without a resolution kind).
	ResolutionUnknown ResolutionKind = ""

	// ResolutionExact means the receiver's static type was known and the method
	// resolved to a unique declared target on that type (or an overload set on
	// that exact type). Always traversed.
	ResolutionExact ResolutionKind = "exact"

	// ResolutionInterfaceDispatch means the target was found by expanding an
	// interface/abstract method to concrete implementations matching name+arity
	// within a namespace root. Trusted ONLY when exactly one implementation is
	// present in the current component's direct dependencies; ambiguous (>1)
	// call sites fail closed.
	ResolutionInterfaceDispatch ResolutionKind = "interface_dispatch"

	// ResolutionNameOnly means the target was guessed by method name + arity
	// (plus namespace heuristics) with no receiver type anchor — e.g. fluent
	// fallback. Never traversed.
	ResolutionNameOnly ResolutionKind = "name_only"
)
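The per-kind rules in the comments above can be summarised as a small decision function. This is a sketch of the documented policy, not the stitcher's actual code: `decide` is a hypothetical name, the constants are local mirrors, and mapping unrecognised kinds to the unknown reason is an assumption.

```go
package main

import "fmt"

// Local mirrors of the documented constants.
type ResolutionKind string

const (
	ResolutionUnknown           ResolutionKind = ""
	ResolutionExact             ResolutionKind = "exact"
	ResolutionInterfaceDispatch ResolutionKind = "interface_dispatch"
	ResolutionNameOnly          ResolutionKind = "name_only"
)

const (
	SuppressReasonUnknown           = "unknown_resolution"
	SuppressReasonNameOnly          = "name_only"
	SuppressReasonAmbiguousDispatch = "interface_dispatch_ambiguous"
)

// decide reports whether an edge may extend a chain. candidates is the number
// of implementations in the edge's dispatch group and is expected to be at
// least 1; anything other than exactly one fails closed for interface dispatch.
// When ok is false, reason is the suppression reason to record.
func decide(kind ResolutionKind, candidates int) (reason string, ok bool) {
	switch kind {
	case ResolutionExact:
		return "", true
	case ResolutionInterfaceDispatch:
		if candidates == 1 {
			return "", true
		}
		return SuppressReasonAmbiguousDispatch, false
	case ResolutionNameOnly:
		return SuppressReasonNameOnly, false
	default:
		// ASSUMPTION: the zero value and any unrecognised kind are both
		// treated as unknown and never traversed.
		return SuppressReasonUnknown, false
	}
}

func main() {
	for _, tc := range []struct {
		kind       ResolutionKind
		candidates int
	}{
		{ResolutionExact, 1},
		{ResolutionInterfaceDispatch, 1},
		{ResolutionInterfaceDispatch, 2},
		{ResolutionNameOnly, 1},
		{ResolutionUnknown, 1},
	} {
		reason, ok := decide(tc.kind, tc.candidates)
		fmt.Printf("kind=%q candidates=%d traverse=%v reason=%q\n", tc.kind, tc.candidates, ok, reason)
	}
}
```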

type Result

type Result struct {
	Chains []FindingChain

	// Suppressed records call edges the policy refused to traverse. It is the
	// audit trail for fail-closed decisions and the data source for a future
	// opt-in "show me the uncertain paths too" mode. It never affects Chains.
	Suppressed []SuppressedEdge
}

Result is the stitched reachability output in its minimal semantic form. Rendering into crypto-finder's customer-facing callgraph schema is a later adapter concern.

func Stitch

func Stitch(root ComponentKey, deps DependencyGraph, fragments map[ComponentKey]Fragment) (*Result, error)

Stitch composes reusable component graph fragments into root-to-crypto reachability chains for root.

This is the pure graph algorithm. It deliberately does not know about storage, compression, or HTTP response DTOs.

type SuppressedEdge

type SuppressedEdge struct {
	Caller     CallFrame
	MethodName string
	Arity      int
	Reason     string
	Candidates []ComponentKey
}

SuppressedEdge is one call edge (or grouped call site) the stitcher declined to traverse, with the reason and the candidate targets it would have reached.
