semanticfp

package
v0.1.6 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 1, 2026 License: MIT Imports: 14 Imported by: 0

Documentation


Constants

This section is empty.

Variables

var DefaultLiteralPolicy = LiteralPolicy{
	AbstractControlFlowComparisons: true,
	KeepSmallIntegerIndices:        true,
	KeepReturnStatusValues:         true,
	SmallIntMin:                    -16,
	SmallIntMax:                    16,
	AbstractOtherTypes:             true,
}

DefaultLiteralPolicy is the standard policy for fingerprinting. It abstracts most literals, including strings and large numbers, but preserves small integers (those in the range -16 to 16) used in common contexts such as array indexing and return codes. This makes the fingerprint resilient to minor refactoring while retaining key semantics.

var KeepAllLiteralsPolicy = LiteralPolicy{
	AbstractControlFlowComparisons: false,
	KeepSmallIntegerIndices:        true,
	KeepReturnStatusValues:         true,
	SmallIntMin:                    math.MinInt64,
	SmallIntMax:                    math.MaxInt64,
	AbstractOtherTypes:             false,
}

KeepAllLiteralsPolicy is a policy designed for testing or exact matching. It disables most abstractions, causing the canonical form to retain almost all literal values. This results in a fingerprint that is highly sensitive to any change in constants.

Functions

func BuildSSAFromPackages

func BuildSSAFromPackages(initialPkgs []*packages.Package) (*ssa.Program, *ssa.Package, error)

BuildSSAFromPackages takes a set of loaded Go packages and constructs their Static Single Assignment (SSA) form. SSA is a low-level intermediate representation that is ideal for program analysis, as it makes data flow explicit. This function returns the complete SSA program and the specific SSA package corresponding to the primary package of interest.

Types

type Canonicalizer

type Canonicalizer struct {
	Policy     LiteralPolicy
	StrictMode bool
	// contains filtered or unexported fields
}

Canonicalizer is responsible for transforming an SSA function into a deterministic, canonical string representation. It normalizes register names, block labels, and the order of commutative operations and block traversal to ensure that semantically equivalent functions produce identical string outputs.
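
The normalization described above can be illustrated with a minimal, standalone sketch. It does not use this package's internals: the `instr` type, the `commutative` table, and the `p0`/`v0` naming scheme are all hypothetical stand-ins chosen for the example. Parameters are numbered by declaration order, results by definition order, and operands of commutative operators are sorted, so two functions that differ only in names and operand order render identically.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// instr is a hypothetical, simplified stand-in for an SSA instruction of the
// form dst = op(args...). It is not a type from this package.
type instr struct {
	dst  string
	op   string
	args []string
}

// commutative lists the operators whose operand order does not matter
// (an assumption for this sketch).
var commutative = map[string]bool{"add": true, "mul": true}

// canonicalize renames parameters to p0, p1, ... and results to v0, v1, ...,
// and sorts the operands of commutative operators.
func canonicalize(params []string, body []instr) string {
	names := make(map[string]string)
	for i, p := range params {
		names[p] = fmt.Sprintf("p%d", i)
	}
	next := 0
	var sb strings.Builder
	for _, in := range body {
		args := make([]string, len(in.args))
		for i, a := range in.args {
			if n, ok := names[a]; ok {
				args[i] = n
			} else {
				args[i] = a // unknown operands (e.g. constants) pass through
			}
		}
		if commutative[in.op] {
			sort.Strings(args)
		}
		dst := fmt.Sprintf("v%d", next)
		next++
		names[in.dst] = dst
		fmt.Fprintf(&sb, "%s = %s %s\n", dst, in.op, strings.Join(args, ", "))
	}
	return sb.String()
}

func main() {
	f1 := canonicalize([]string{"a", "b"}, []instr{
		{"sum", "add", []string{"a", "b"}},
		{"out", "mul", []string{"sum", "a"}},
	})
	f2 := canonicalize([]string{"x", "y"}, []instr{
		{"t", "add", []string{"y", "x"}},
		{"r", "mul", []string{"x", "t"}},
	})
	fmt.Print(f1)
	fmt.Println("identical:", f1 == f2)
}
```

Seeding parameter names before walking the body keeps non-commutative operators such as subtraction sensitive to operand order, which a pure first-appearance numbering would not guarantee.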

func NewCanonicalizer

func NewCanonicalizer(policy LiteralPolicy) *Canonicalizer

NewCanonicalizer creates a new instance of the Canonicalizer with a given literal abstraction policy.

func (*Canonicalizer) CanonicalizeFunction

func (c *Canonicalizer) CanonicalizeFunction(fn *ssa.Function) string

CanonicalizeFunction is the main entry point for the canonicalization process. It takes an SSA function, performs a deterministic traversal of its control flow graph, and processes each instruction to generate a stable, comparable string representation.

type FingerprintResult

type FingerprintResult struct {
	FunctionName string
	Fingerprint  string
	CanonicalIR  string
	Pos          token.Pos // The position of the 'func' keyword, for precise AST matching.
}

FingerprintResult encapsulates the output of the semantic fingerprinting process for a single function. It includes the function's name, its semantic fingerprint (a hash), the canonical intermediate representation (IR) from which the hash was derived, and the function's position in the source code.
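
The following sketch shows one way a position of the `func` keyword could be matched back to an AST declaration using only the standard library. It is a hedged illustration: `funcAt` and `demo` are hypothetical helpers, and the position is taken from the same `token.FileSet` that parsed the file, since a `token.Pos` is only meaningful relative to the file set that produced it.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const src = `package demo

func Add(a, b int) int { return a + b }

func Sub(a, b int) int { return a - b }
`

// funcAt returns the top-level declaration whose 'func' keyword sits at pos,
// or nil if there is none. Function literals are not considered.
func funcAt(file *ast.File, pos token.Pos) *ast.FuncDecl {
	for _, decl := range file.Decls {
		if fd, ok := decl.(*ast.FuncDecl); ok && fd.Type.Func == pos {
			return fd
		}
	}
	return nil
}

// demo parses src, takes the position of the second declaration as a
// stand-in for a FingerprintResult.Pos, and resolves it back to a name
// and line number.
func demo() (name string, line int, err error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return "", 0, err
	}
	pos := file.Decls[1].(*ast.FuncDecl).Type.Func
	fd := funcAt(file, pos)
	if fd == nil {
		return "", 0, fmt.Errorf("no function at %v", fset.Position(pos))
	}
	return fd.Name.Name, fset.Position(pos).Line, nil
}

func main() {
	name, line, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s declared at line %d\n", name, line)
}
```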

func FingerprintPackages

func FingerprintPackages(initialPkgs []*packages.Package, policy LiteralPolicy, strictMode bool) ([]FingerprintResult, error)

FingerprintPackages is the most efficient entry point for fingerprinting when the Go packages have already been loaded by the calling application. It takes the loaded packages, builds their SSA representation, and generates fingerprints for all non-synthetic functions.

func FingerprintSource

func FingerprintSource(filename string, src string, policy LiteralPolicy) ([]FingerprintResult, error)

FingerprintSource is a high-level entry point for fingerprinting a single Go source file provided as a string. It handles the parsing, type-checking, and SSA construction before generating fingerprints for all functions within the source. It is best suited for analyzing isolated snippets, such as diff hunks.
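
The first two stages named above, parsing and type-checking a source string, can be sketched with the standard library alone. This is not the package's implementation: `checkedFuncs` is a hypothetical helper, the snippet is assumed to import nothing (no importer is configured), and the SSA construction step is omitted because it requires `golang.org/x/tools/go/ssa`.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

const goodSrc = `package demo

func Add(a, b int) int { return a + b }

func Sub(a, b int) int { return a - b }
`

const badSrc = `package demo

func Bad() int { return "x" }
`

// checkedFuncs parses and type-checks src, then returns the names of the
// package-level functions it declares, in sorted order.
func checkedFuncs(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, fmt.Errorf("parse: %w", err)
	}
	conf := types.Config{} // no Importer: the snippet must not import anything
	pkg, err := conf.Check(file.Name.Name, fset, []*ast.File{file}, nil)
	if err != nil {
		return nil, fmt.Errorf("type-check: %w", err)
	}
	var names []string
	scope := pkg.Scope()
	for _, name := range scope.Names() { // Names returns a sorted slice
		if _, ok := scope.Lookup(name).(*types.Func); ok {
			names = append(names, name)
		}
	}
	return names, nil
}

func main() {
	names, err := checkedFuncs("demo.go", goodSrc)
	fmt.Println(names, err)

	_, err = checkedFuncs("bad.go", badSrc)
	fmt.Println("bad snippet rejected:", err != nil)
}
```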

func FingerprintSourceAdvanced

func FingerprintSourceAdvanced(filename string, src string, policy LiteralPolicy, strictMode bool) ([]FingerprintResult, error)

FingerprintSourceAdvanced is an extended version of `FingerprintSource` that provides additional control over the fingerprinting process, such as enabling a strict mode that will panic on unhandled SSA instructions.
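
Because strict mode panics rather than returning an error, a caller may want to guard the call with `recover`. The sketch below illustrates the pattern with hypothetical stand-ins: `render` plays the role of an instruction handler, and the placeholder it returns in non-strict mode is an assumption, since the source only documents the strict-mode panic.

```go
package main

import "fmt"

// render is a hypothetical instruction handler. In strict mode it panics on
// an instruction it does not know; otherwise it emits a placeholder
// (the non-strict behavior is an assumption made for this sketch).
func render(op string, strict bool) string {
	switch op {
	case "add", "ret":
		return op
	}
	if strict {
		panic(fmt.Sprintf("unhandled instruction: %s", op))
	}
	return "<unknown:" + op + ">"
}

// tryRender converts a strict-mode panic into an ordinary error so that one
// bad function does not abort a whole analysis run.
func tryRender(op string, strict bool) (out string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("canonicalization failed: %v", r)
		}
	}()
	return render(op, strict), nil
}

func main() {
	fmt.Println(tryRender("add", true))
	fmt.Println(tryRender("select", false))
	fmt.Println(tryRender("select", true))
}
```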

func GenerateFingerprint

func GenerateFingerprint(fn *ssa.Function, policy LiteralPolicy, strictMode bool) FingerprintResult

GenerateFingerprint is the core function that computes the semantic fingerprint for a single SSA function. It first normalizes the function's control flow, then generates a canonical string representation of its IR, and finally hashes this string to produce the fingerprint.
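
The final hashing step can be sketched as follows. The choice of SHA-256 with hex encoding is an assumption made for illustration; the source does not state which hash function the package uses. The `fingerprint` helper is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fingerprint hashes a canonical IR string. SHA-256 is an assumption for
// this sketch, not a documented property of the package.
func fingerprint(canonicalIR string) string {
	sum := sha256.Sum256([]byte(canonicalIR))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := fingerprint("v0 = add p0, p1\nret v0\n")
	b := fingerprint("v0 = add p0, p1\nret v0\n")
	c := fingerprint("v0 = sub p0, p1\nret v0\n")
	fmt.Println("same IR, same fingerprint:", a == b)
	fmt.Println("different IR, different fingerprint:", a != c)
}
```

Since the hash is computed over the canonical string, any two functions that canonicalize to the same IR share a fingerprint, and comparing fingerprints is equivalent to comparing canonical IR up to hash collisions.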

type LiteralPolicy

type LiteralPolicy struct {
	AbstractControlFlowComparisons bool  // If true, abstracts literals used in `if` conditions.
	KeepSmallIntegerIndices        bool  // If true, preserves small integers used as array/slice indices.
	KeepReturnStatusValues         bool  // If true, preserves small integers used in `return` statements.
	SmallIntMin                    int64 // The minimum value for an integer to be considered "small".
	SmallIntMax                    int64 // The maximum value for an integer to be considered "small".
	AbstractOtherTypes             bool  // If true, abstracts non-integer literals like strings and floats.
}

LiteralPolicy defines a configurable strategy for determining which literal values (e.g., numbers, strings) should be abstracted into placeholders during the canonicalization of SSA form. This allows fingerprinting to focus on program structure and logic rather than specific data values.

func (*LiteralPolicy) ShouldAbstract

func (p *LiteralPolicy) ShouldAbstract(c *ssa.Const, usageContext ssa.Instruction) bool

ShouldAbstract is the core logic of the policy. It decides whether a given constant (`ssa.Const`) should be abstracted into a placeholder based on its type, value, and the instruction in which it is used (`usageContext`).
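
To make the interaction between the policy fields concrete, here is a simplified decision function for integer constants. It is a guess at plausible logic, not the package's actual implementation: the `policy` struct mirrors only some of the documented fields, the `usage` enum replaces the real `ssa.Instruction` context, and the handling of the "other" context is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// policy mirrors a subset of the documented LiteralPolicy fields.
type policy struct {
	AbstractControlFlowComparisons bool
	KeepSmallIntegerIndices        bool
	KeepReturnStatusValues         bool
	SmallIntMin, SmallIntMax       int64
}

// usage is a hypothetical stand-in for the instruction that uses a constant.
type usage int

const (
	usageOther usage = iota
	usageIndex
	usageReturn
	usageCompare
)

var defaultPolicy = policy{true, true, true, -16, 16}
var keepAllPolicy = policy{false, true, true, math.MinInt64, math.MaxInt64}

// shouldAbstractInt reports whether integer v would be replaced by a
// placeholder under p when used in context u (assumed logic).
func shouldAbstractInt(p policy, v int64, u usage) bool {
	small := v >= p.SmallIntMin && v <= p.SmallIntMax
	switch u {
	case usageIndex:
		return !(p.KeepSmallIntegerIndices && small)
	case usageReturn:
		return !(p.KeepReturnStatusValues && small)
	case usageCompare:
		return p.AbstractControlFlowComparisons
	}
	return !small // assumption: elsewhere, only out-of-range integers are abstracted
}

func main() {
	fmt.Println(shouldAbstractInt(defaultPolicy, 3, usageIndex))    // small index is kept
	fmt.Println(shouldAbstractInt(defaultPolicy, 1000, usageIndex)) // large index is abstracted
	fmt.Println(shouldAbstractInt(defaultPolicy, 5, usageCompare))  // comparison literal is abstracted
	fmt.Println(shouldAbstractInt(keepAllPolicy, 5, usageCompare))  // kept under the keep-all policy
}
```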
