tracer

package
v0.1.3

This package is not in the latest version of its module.
Published: May 27, 2026 License: GPL-3.0 Imports: 20 Imported by: 0

Documentation

Constants

const (
	LabelHTTPGet     = constants.LabelHTTPGet
	LabelHTTPPost    = constants.LabelHTTPPost
	LabelHTTPCookie  = constants.LabelHTTPCookie
	LabelHTTPHeader  = constants.LabelHTTPHeader
	LabelHTTPBody    = constants.LabelHTTPBody
	LabelCLI         = constants.LabelCLI
	LabelEnvironment = constants.LabelEnvironment
	LabelFile        = constants.LabelFile
	LabelDatabase    = constants.LabelDatabase
	LabelNetwork     = constants.LabelNetwork
	LabelUserInput   = constants.LabelUserInput
)

Re-exports the InputLabel constants for backward compatibility.

const (
	StepAssignment            = constants.StepAssignment
	StepParameterPass         = constants.StepParameterPass
	StepReturn                = constants.StepReturn
	StepInterproceduralReturn = constants.StepInterproceduralReturn
	StepConcatenation         = constants.StepConcatenation
	StepArrayAccess           = constants.StepArrayAccess
	StepObjectAccess          = constants.StepObjectAccess
	StepDestructure           = constants.StepDestructure
)

Re-exports the PropagationStepType constants for backward compatibility.

const (
	ScopeGlobal   = constants.ScopeGlobal
	ScopeFile     = constants.ScopeFile
	ScopeModule   = constants.ScopeModule
	ScopeClass    = constants.ScopeClass
	ScopeFunction = constants.ScopeFunction
	ScopeBlock    = constants.ScopeBlock
)

Re-exports the ScopeType constants for backward compatibility.

Variables

This section is empty.

Functions

This section is empty.

Types

type AnalysisState

type AnalysisState struct {
	ScopeState                                    // scope management (push/pop, lookup)
	TaintedValues     map[string]*TaintedVariable // variable name -> tainted info (taint tracking)
	FunctionSummaries map[string]*FunctionSummary
	VisitedFunctions  map[string]bool
}

AnalysisState maintains the current state during analysis of a single file.

It embeds ScopeState for scope management (push/pop scopes, hierarchical variable lookup) and adds taint-tracking fields (TaintedValues, FunctionSummaries, VisitedFunctions) as a separate concern. Keeping them in one struct is intentional for this package: both concerns are needed together on every file-analysis pass and separating them into two arguments everywhere would add noise without benefit. The split is expressed structurally via the embedded ScopeState so that the two concerns remain conceptually distinct.

See also: pkg/semantic/types.AnalysisState, which is an unrelated type used by the deep semantic analysis layer. The two types serve different levels of abstraction and should not be merged (see M26 note in types.go).

func NewAnalysisState

func NewAnalysisState() *AnalysisState

NewAnalysisState creates a new analysis state with global scope

func (*AnalysisState) EnterScope

func (s *AnalysisState) EnterScope(scopeType ScopeType, name string, startLine, endLine int) *Scope

EnterScope creates and enters a new scope (delegates to ScopeState).

func (*AnalysisState) ExitScope

func (s *AnalysisState) ExitScope() *Scope

ExitScope exits the current scope and returns to parent (delegates to ScopeState).

func (*AnalysisState) LookupVariable

func (s *AnalysisState) LookupVariable(name string) (*TaintedVariable, bool)

LookupVariable looks up a variable in current and parent scopes, then falls back to the flat TaintedValues map for file-global tainted variables.

func (*AnalysisState) SetTainted

func (s *AnalysisState) SetTainted(name string, tainted *TaintedVariable)

SetTainted marks a variable as tainted in the current scope and in the flat TaintedValues map so that findTaintInfo can reach it via O(1) lookup.
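
The lookup order described for LookupVariable and SetTainted (lexical scopes first, then the flat map) can be sketched as follows. This is a minimal illustration with simplified stand-in types, not the package's implementation; the names taintedVar, scope, and state are hypothetical.

```go
package main

import "fmt"

// taintedVar is a simplified stand-in for TaintedVariable (hypothetical).
type taintedVar struct{ Name string }

// scope is a simplified stand-in for Scope: a variable table plus a parent link.
type scope struct {
	parent *scope
	vars   map[string]*taintedVar
}

// state mirrors the two concerns: a current scope and a flat map.
type state struct {
	current *scope
	flat    map[string]*taintedVar
}

func newState() *state {
	return &state{
		current: &scope{vars: map[string]*taintedVar{}},
		flat:    map[string]*taintedVar{},
	}
}

// setTainted records the variable in the current scope and in the flat map.
func (s *state) setTainted(name string, v *taintedVar) {
	s.current.vars[name] = v
	s.flat[name] = v
}

// lookup walks from the current scope up through its parents, then falls
// back to the flat map.
func (s *state) lookup(name string) (*taintedVar, bool) {
	for sc := s.current; sc != nil; sc = sc.parent {
		if v, ok := sc.vars[name]; ok {
			return v, true
		}
	}
	v, ok := s.flat[name]
	return v, ok
}

func main() {
	s := newState()
	s.setTainted("user", &taintedVar{Name: "user"})
	// Enter a child scope; the variable is still visible through the parent.
	s.current = &scope{parent: s.current, vars: map[string]*taintedVar{}}
	_, ok := s.lookup("user")
	fmt.Println(ok)
}
```

Writing to both places costs one extra map store per call and lets code that has no scope context still find a tainted name in constant time.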

type Argument

type Argument struct {
	Name  string
	Text  string
	Index int
}

Argument represents a function argument

type Config

type Config struct {
	// Languages to analyze (empty = all supported)
	Languages []string

	// Maximum inter-procedural analysis depth
	MaxDepth int

	// Number of parallel workers
	Workers int

	// Custom source definitions (in addition to built-in)
	CustomSources []sources.Definition

	// Skip directories matching these patterns
	SkipDirs []string

	// Include only files matching these patterns (empty = all)
	IncludePatterns []string

	// Verbose enables diagnostic logging to stdout during analysis
	Verbose bool
}

Config configures the tracer

func DefaultConfig

func DefaultConfig() *Config

DefaultConfig returns sensible defaults using centralized sources

func (*Config) Validate

func (c *Config) Validate() error

Validate checks that the config has valid values.
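
The documentation does not list the rules that Validate enforces. As a rough illustration of what such a check might look like, the sketch below validates a trimmed stand-in struct. The specific rules (non-negative MaxDepth, at least one worker) are assumptions chosen for the example and may not match the package's actual checks.

```go
package main

import (
	"errors"
	"fmt"
)

// config is a trimmed stand-in for Config with only the numeric fields.
type config struct {
	MaxDepth int
	Workers  int
}

// validate applies hypothetical rules; the real Validate may differ.
func (c *config) validate() error {
	if c == nil {
		return errors.New("config is nil")
	}
	if c.MaxDepth < 0 {
		return fmt.Errorf("MaxDepth must be >= 0, got %d", c.MaxDepth)
	}
	if c.Workers < 1 {
		return fmt.Errorf("Workers must be >= 1, got %d", c.Workers)
	}
	return nil
}

func main() {
	fmt.Println((&config{MaxDepth: 5, Workers: 4}).validate())
	fmt.Println((&config{MaxDepth: 5, Workers: 0}).validate())
}
```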

type FlowEdge

type FlowEdge struct {
	From     string       `json:"from"` // Node ID
	To       string       `json:"to"`   // Node ID
	Type     FlowEdgeType `json:"type"`
	Location Location     `json:"location"`
}

FlowEdge connects two nodes showing data flow

type FlowEdgeType

type FlowEdgeType = constants.FlowEdgeType

FlowEdgeType represents how data flows between two nodes. Re-exported from pkg/sources/constants.

const (
	FlowEdgeAssignment FlowEdgeType = constants.EdgeAssignment
	FlowEdgeCall       FlowEdgeType = constants.EdgeCall
	FlowEdgeReturn     FlowEdgeType = constants.EdgeReturn
	FlowEdgeTaint      FlowEdgeType = constants.EdgeDataFlow
	FlowEdgeParameter  FlowEdgeType = constants.EdgeParameter
)

FlowEdge type constants

type FlowGraph

type FlowGraph struct {
	Nodes []FlowNode `json:"nodes"`
	Edges []FlowEdge `json:"edges"`
}

FlowGraph represents the complete input flow graph

type FlowNode

type FlowNode struct {
	ID       string       `json:"id"`
	Type     FlowNodeType `json:"type"`
	Name     string       `json:"name"`
	Location Location     `json:"location"`
}

FlowNode is a node in the flow graph

type FlowNodeType

type FlowNodeType = constants.FlowNodeType

FlowNodeType represents the type of a node in the data flow graph. Re-exported from pkg/sources/constants.

const (
	FlowNodeSource   FlowNodeType = constants.NodeSource
	FlowNodeVariable FlowNodeType = constants.NodeVariable
	FlowNodeFunction FlowNodeType = constants.NodeFunction
	FlowNodeParam    FlowNodeType = constants.NodeParam
	FlowNodeCarrier  FlowNodeType = constants.NodeCarrier
	FlowNodeProperty FlowNodeType = constants.NodeProperty
	FlowNodeReturn   FlowNodeType = constants.NodeReturn
)

FlowNode type constants

type FullAnalysisState

type FullAnalysisState struct {
	*AnalysisState

	// Slices for output (computed on demand)
	Sources          []*InputSource
	TaintedVariables []*TaintedVariable
	TaintedFunctions []*TaintedFunction

	PropagationPaths map[string][]*PropagationPath // source ID -> paths
	ReturnsTainted   map[string]*InputSource       // function name -> source
	// contains filtered or unexported fields
}

FullAnalysisState extends AnalysisState with output slices and additional maps that provide O(1) lookups.

func NewFullAnalysisState

func NewFullAnalysisState() *FullAnalysisState

NewFullAnalysisState creates a complete analysis state with optimized maps

func (*FullAnalysisState) AddPropagationStep

func (s *FullAnalysisState) AddPropagationStep(source *InputSource, step PropagationStep)

AddPropagationStep adds a propagation step for a source

func (*FullAnalysisState) AddReturnsTaintedFunction

func (s *FullAnalysisState) AddReturnsTaintedFunction(funcName string, source *InputSource)

AddReturnsTaintedFunction marks a function as returning tainted data

func (*FullAnalysisState) AddSource

func (s *FullAnalysisState) AddSource(source *InputSource)

AddSource adds a new input source with O(1) deduplication

func (*FullAnalysisState) AddTaintedFunction

func (s *FullAnalysisState) AddTaintedFunction(tf *TaintedFunction)

AddTaintedFunction adds a tainted function with O(1) deduplication

func (*FullAnalysisState) AddTaintedVariable

func (s *FullAnalysisState) AddTaintedVariable(tv *TaintedVariable)

AddTaintedVariable adds a tainted variable with O(1) deduplication. taintedVarsMap is the single source of truth; AnalysisState.TaintedValues is not written here to avoid the dual-map divergence.

func (*FullAnalysisState) BuildFlowGraph

func (s *FullAnalysisState) BuildFlowGraph() *FlowGraph

BuildFlowGraph builds a flow graph from the analysis state

func (*FullAnalysisState) GetTaintedVariables

func (s *FullAnalysisState) GetTaintedVariables() []*TaintedVariable

GetTaintedVariables returns all tainted variables

func (*FullAnalysisState) IsTainted

func (s *FullAnalysisState) IsTainted(name, scope, filePath string) (*TaintedVariable, bool)

IsTainted reports whether the variable identified by (name, scope, filePath) is tracked in the FullAnalysisState's dedup map.
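
The O(1) deduplication mentioned for AddTaintedVariable, together with the (name, scope, filePath) lookup in IsTainted, can be pictured as a slice paired with an index map. The sketch below is one way to do that under stated assumptions: the composite key type varKey and the store type are hypothetical, and the package's real key format is not documented here.

```go
package main

import "fmt"

// taintedVar is a simplified stand-in for TaintedVariable.
type taintedVar struct {
	Name, Scope, FilePath string
}

// varKey is a hypothetical composite key built from the identifying triple.
type varKey struct {
	name, scope, filePath string
}

// store keeps an ordered slice for output and a map for O(1) membership tests.
type store struct {
	vars  []*taintedVar
	index map[varKey]*taintedVar
}

func newStore() *store {
	return &store{index: map[varKey]*taintedVar{}}
}

// add appends tv unless an entry with the same key already exists. It
// reports whether tv was added.
func (s *store) add(tv *taintedVar) bool {
	k := varKey{tv.Name, tv.Scope, tv.FilePath}
	if _, seen := s.index[k]; seen {
		return false
	}
	s.index[k] = tv
	s.vars = append(s.vars, tv)
	return true
}

// isTainted looks a variable up by its identifying triple.
func (s *store) isTainted(name, scope, filePath string) (*taintedVar, bool) {
	tv, ok := s.index[varKey{name, scope, filePath}]
	return tv, ok
}

func main() {
	s := newStore()
	fmt.Println(s.add(&taintedVar{"id", "handler", "a.go"}))
	fmt.Println(s.add(&taintedVar{"id", "handler", "a.go"}))
	fmt.Println(len(s.vars))
}
```

A struct key avoids the ambiguity that string concatenation can introduce when a name or path contains the separator character.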

type FunctionSummary

type FunctionSummary struct {
	Name            string          `json:"name"`
	FilePath        string          `json:"file_path"`
	Language        string          `json:"language"`
	StartLine       int             `json:"start_line"`
	EndLine         int             `json:"end_line"`
	Parameters      []ParameterInfo `json:"parameters"`
	ParamsToReturn  []int           `json:"params_to_return"` // Indices of params that flow to return
	ParamsToParams  map[int][]int   `json:"params_to_params"` // Param N flows to param M in nested calls
	IsSource        bool            `json:"is_source"`        // Function itself returns user input
	CalledFunctions []string        `json:"called_functions"`
}

FunctionSummary captures how a function propagates input

func (*FunctionSummary) GetParamName

func (fs *FunctionSummary) GetParamName(index int) string

GetParamName returns parameter name by index
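
A summary such as FunctionSummary is typically consulted at call sites: given which arguments are tainted, ParamsToReturn indicates whether the call result is tainted too. The sketch below illustrates that idea with trimmed stand-in types. The helper returnTainted is hypothetical, and the empty-string result of paramName for an unknown index is an assumption about GetParamName's behavior.

```go
package main

import "fmt"

// paramInfo and summary are trimmed stand-ins for ParameterInfo and
// FunctionSummary.
type paramInfo struct {
	Index int
	Name  string
}

type summary struct {
	Name           string
	Parameters     []paramInfo
	ParamsToReturn []int // indices of parameters that flow to the return value
}

// paramName returns the name of the parameter at index, or "" when the
// index is unknown. The empty-string fallback is an assumption.
func (s *summary) paramName(index int) string {
	for _, p := range s.Parameters {
		if p.Index == index {
			return p.Name
		}
	}
	return ""
}

// returnTainted reports whether a call's result is tainted, given the
// indices of the tainted arguments at the call site.
func (s *summary) returnTainted(taintedArgs map[int]bool) bool {
	for _, i := range s.ParamsToReturn {
		if taintedArgs[i] {
			return true
		}
	}
	return false
}

func main() {
	s := &summary{
		Name:           "wrap",
		Parameters:     []paramInfo{{0, "prefix"}, {1, "value"}},
		ParamsToReturn: []int{1},
	}
	fmt.Println(s.paramName(1), s.returnTainted(map[int]bool{1: true}))
}
```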

type InputLabel

type InputLabel = constants.InputLabel

InputLabel categorizes the type of user input. Re-exported from pkg/sources/constants for backward compatibility.

type InputSource

type InputSource struct {
	ID       string       `json:"id"`
	Type     string       `json:"type"` // e.g., "$_GET", "req.body", "argv"
	Key      string       `json:"key"`  // e.g., "username" in $_GET['username']
	Location Location     `json:"location"`
	Labels   []InputLabel `json:"labels"`
	Language string       `json:"language"`
}

InputSource represents where user input enters the code

type InterproceduralAnalyzer

type InterproceduralAnalyzer struct {
	// contains filtered or unexported fields
}

InterproceduralAnalyzer coordinates cross-function taint analysis. Its single responsibility is orchestrating three subordinate concerns:

  1. Building per-function summaries (BuildFunctionSummary and helpers).
  2. Maintaining the call graph (via the embedded *callGraph).
  3. Propagating taint across function boundaries (PropagateInterproceduralTaint, RunAnalysis, propagateReturnTaint, propagateCallTaint).

func NewInterproceduralAnalyzer

func NewInterproceduralAnalyzer(state *FullAnalysisState, maxDepth int, parserSvc *parser.Service) *InterproceduralAnalyzer

NewInterproceduralAnalyzer creates a new inter-procedural analyzer

func (*InterproceduralAnalyzer) BuildFunctionSummary

func (ipa *InterproceduralAnalyzer) BuildFunctionSummary(node *sitter.Node, src []byte, filePath string, language string) *FunctionSummary

BuildFunctionSummary builds a summary for a function definition

func (*InterproceduralAnalyzer) GetAllSummaries

func (ipa *InterproceduralAnalyzer) GetAllSummaries() map[string]*FunctionSummary

GetAllSummaries returns all function summaries

func (*InterproceduralAnalyzer) GetCallGraph

func (ipa *InterproceduralAnalyzer) GetCallGraph() map[string][]string

GetCallGraph returns a snapshot of the call graph (caller → []callee). The returned map is a deep copy; callers may read it without any lock.
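
A deep copy of a map[string][]string has to duplicate each slice as well as the map, otherwise the snapshot would share backing arrays with the live graph. The sketch below shows one way to take such a snapshot. The function name snapshot is hypothetical, and any locking the package performs while copying is omitted.

```go
package main

import "fmt"

// snapshot returns a deep copy of a caller -> callees map. Both the map and
// each callee slice are copied, so later writes to the original cannot be
// observed through the snapshot.
func snapshot(graph map[string][]string) map[string][]string {
	out := make(map[string][]string, len(graph))
	for caller, callees := range graph {
		out[caller] = append([]string(nil), callees...)
	}
	return out
}

func main() {
	graph := map[string][]string{"main": {"parse", "render"}}
	snap := snapshot(graph)
	graph["main"][0] = "changed" // mutate the original after the snapshot
	fmt.Println(snap["main"])
}
```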

func (*InterproceduralAnalyzer) GetFunctionSummary

func (ipa *InterproceduralAnalyzer) GetFunctionSummary(name string) *FunctionSummary

GetFunctionSummary returns a function summary by name

func (*InterproceduralAnalyzer) PropagateInterproceduralTaint

func (ipa *InterproceduralAnalyzer) PropagateInterproceduralTaint(callNode *sitter.Node, src []byte, filePath string, callerState *FullAnalysisState, visited map[string]bool)

PropagateInterproceduralTaint propagates taint across function boundaries. visited is a caller-owned map used to prevent re-entrant processing of the same call site; the caller must allocate it (make(map[string]bool)) once per top-level propagation request and pass the same map on recursive invocations. Keeping visited outside the struct eliminates the data race that arose when ipa.visited was read and written without holding ipa.mu.
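
The caller-owned visited map is a common way to make a recursive walk terminate on cycles without storing shared mutable state on the analyzer. The sketch below shows the pattern on a toy call graph. It keys visited by function name for brevity, whereas the package keys it by call site, and the propagate function and calls variable are hypothetical.

```go
package main

import "fmt"

// calls is a toy call graph containing a cycle: a -> b -> a.
var calls = map[string][]string{
	"a": {"b"},
	"b": {"a", "c"},
	"c": nil,
}

// propagate visits fn and everything reachable from it. visited is owned by
// the caller and shared across the recursion, so a cycle ends the walk
// instead of recursing forever. order records the visit sequence.
func propagate(fn string, visited map[string]bool, order *[]string) {
	if visited[fn] {
		return
	}
	visited[fn] = true
	*order = append(*order, fn)
	for _, callee := range calls[fn] {
		propagate(callee, visited, order)
	}
}

func main() {
	// One fresh map per top-level request, passed unchanged to recursive calls.
	visited := make(map[string]bool)
	var order []string
	propagate("a", visited, &order)
	fmt.Println(order)
}
```

Because each top-level request allocates its own map, concurrent requests never touch the same visited state and need no lock for it.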

func (*InterproceduralAnalyzer) RunAnalysis

func (ipa *InterproceduralAnalyzer) RunAnalysis(result *TraceResult)

RunAnalysis performs cross-function taint analysis against the provided result, collecting all unique file paths, building function summaries, and iteratively propagating taint until a fixed point is reached.
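
Iterating to a fixed point means repeating the propagation pass until a full pass adds nothing new. The sketch below shows that loop on a toy flow relation. The flowsTo table and fixedPoint function are hypothetical stand-ins for the function summaries and call graph that RunAnalysis works with.

```go
package main

import "fmt"

// flowsTo lists, for each function, the functions that receive its tainted
// output. This toy relation stands in for summaries plus the call graph.
var flowsTo = map[string][]string{
	"readInput": {"normalize"},
	"normalize": {"store"},
	"store":     nil,
}

// fixedPoint repeatedly applies the flow relation until a full pass adds
// nothing new, then returns the set of tainted functions.
func fixedPoint(seed []string) map[string]bool {
	tainted := make(map[string]bool)
	for _, s := range seed {
		tainted[s] = true
	}
	for changed := true; changed; {
		changed = false
		// Collect additions first so the set is not modified while ranging over it.
		var added []string
		for fn := range tainted {
			for _, next := range flowsTo[fn] {
				if !tainted[next] {
					added = append(added, next)
				}
			}
		}
		for _, fn := range added {
			tainted[fn] = true
			changed = true
		}
	}
	return tainted
}

func main() {
	tainted := fixedPoint([]string{"readInput"})
	fmt.Println(len(tainted))
}
```

The loop terminates because the tainted set only grows and is bounded by the number of functions.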

type Location

type Location struct {
	FilePath  string `json:"file_path"`
	Line      int    `json:"line"`
	Column    int    `json:"column"`
	EndLine   int    `json:"end_line"`
	EndColumn int    `json:"end_column"`
	Snippet   string `json:"snippet,omitempty"`
}

Location represents a precise location in source code

type ParameterInfo

type ParameterInfo struct {
	Index int    `json:"index"`
	Name  string `json:"name"`
	Type  string `json:"type,omitempty"`
}

ParameterInfo contains information about a function parameter

type PropagationPath

type PropagationPath struct {
	Source      *InputSource      `json:"source"`
	Steps       []PropagationStep `json:"steps"`
	Destination Location          `json:"destination"`
}

PropagationPath shows how input flows from source to destination

type PropagationStep

type PropagationStep struct {
	Type     PropagationStepType `json:"type"`
	Variable string              `json:"variable"`
	Function string              `json:"function,omitempty"` // If crossing function boundary
	Location Location            `json:"location"`
}

PropagationStep is one step in the propagation chain

type PropagationStepType

type PropagationStepType = constants.PropagationStepType

PropagationStepType defines the type of propagation step. Re-exported from pkg/sources/constants for backward compatibility.

type Scope

type Scope struct {
	ID        string                      `json:"id"`
	Type      ScopeType                   `json:"type"`
	Name      string                      `json:"name"`
	Parent    *Scope                      `json:"-"` // Avoid circular JSON
	ParentID  string                      `json:"parent_id,omitempty"`
	Children  []*Scope                    `json:"-"` // Child scopes
	Variables map[string]*TaintedVariable `json:"-"`
	StartLine int                         `json:"start_line"`
	EndLine   int                         `json:"end_line"`
	StartLoc  Location                    `json:"start_location,omitempty"`
}

Scope represents a variable scope in the code

type ScopeState

type ScopeState struct {
	CurrentScope *Scope
	ScopeStack   []*Scope
}

ScopeState manages the scope stack during AST traversal. It is a pure scope-management concern: push/pop scopes, look up variables within the lexical hierarchy. It knows nothing about taint tracking.

func (*ScopeState) EnterScope

func (ss *ScopeState) EnterScope(scopeType ScopeType, name string, startLine, endLine int) *Scope

EnterScope pushes a new named scope onto the stack.

func (*ScopeState) ExitScope

func (ss *ScopeState) ExitScope() *Scope

ExitScope pops the current scope and restores the parent.

func (*ScopeState) LookupVariable

func (ss *ScopeState) LookupVariable(name string) (*TaintedVariable, bool)

LookupVariable searches for a variable from the current scope upward.
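
The push, pop, and upward lookup operations of ScopeState can be sketched with a slice used as a stack and a parent pointer on each scope. This is an illustration with simplified stand-in types that store plain strings. Returning nil from exit on an empty stack is an assumption, not documented behavior.

```go
package main

import "fmt"

// scope is a simplified stand-in for Scope holding plain string values.
type scope struct {
	name   string
	parent *scope
	vars   map[string]string
}

// scopeState mirrors the CurrentScope / ScopeStack pair of ScopeState.
type scopeState struct {
	current *scope
	stack   []*scope
}

// enter pushes a new scope whose parent is the current one.
func (ss *scopeState) enter(name string) *scope {
	sc := &scope{name: name, parent: ss.current, vars: map[string]string{}}
	ss.stack = append(ss.stack, sc)
	ss.current = sc
	return sc
}

// exit pops the current scope and restores its parent. It returns nil when
// the stack is empty (an assumption).
func (ss *scopeState) exit() *scope {
	if len(ss.stack) == 0 {
		return nil
	}
	top := ss.stack[len(ss.stack)-1]
	ss.stack = ss.stack[:len(ss.stack)-1]
	ss.current = top.parent
	return top
}

// lookup searches the current scope first, then each enclosing scope.
func (ss *scopeState) lookup(name string) (string, bool) {
	for sc := ss.current; sc != nil; sc = sc.parent {
		if v, ok := sc.vars[name]; ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	ss := &scopeState{}
	ss.enter("global").vars["x"] = "outer"
	ss.enter("fn").vars["x"] = "inner"
	v, _ := ss.lookup("x")
	fmt.Println(v)
	ss.exit()
	v, _ = ss.lookup("x")
	fmt.Println(v)
}
```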

type ScopeType

type ScopeType = constants.ScopeType

ScopeType represents the type of scope. Re-exported from pkg/sources/constants for backward compatibility.

type TaintInfo

type TaintInfo struct {
	Source *InputSource
	Depth  int
}

TaintInfo contains information about a tainted value

type TaintPropagator

type TaintPropagator struct {
	// contains filtered or unexported fields
}

TaintPropagator handles taint propagation through code

func NewTaintPropagator

func NewTaintPropagator(state *FullAnalysisState, language string) *TaintPropagator

NewTaintPropagator creates a new taint propagator

func (*TaintPropagator) PropagateFromAssignment

func (prop *TaintPropagator) PropagateFromAssignment(node *sitter.Node, src []byte, filePath string)

PropagateFromAssignment propagates taint from an assignment expression

func (*TaintPropagator) PropagateFromFunctionCall

func (prop *TaintPropagator) PropagateFromFunctionCall(node *sitter.Node, src []byte, filePath string)

PropagateFromFunctionCall propagates taint through function calls

func (*TaintPropagator) PropagateFromReturn

func (prop *TaintPropagator) PropagateFromReturn(node *sitter.Node, src []byte, filePath string)

PropagateFromReturn propagates taint from return statements

type TaintedFunction

type TaintedFunction struct {
	ID              string            `json:"id"`
	Name            string            `json:"name"`
	FilePath        string            `json:"file_path"`
	Line            int               `json:"line"`
	Language        string            `json:"language"`
	TaintedParams   []TaintedParam    `json:"tainted_params"`
	ReceivesThrough []PropagationPath `json:"receives_through,omitempty"`
}

TaintedFunction represents a function that receives user input

type TaintedParam

type TaintedParam struct {
	Index  int              `json:"index"`
	Name   string           `json:"name"`
	Source *InputSource     `json:"source"`
	Path   *PropagationPath `json:"path,omitempty"`
}

TaintedParam represents a function parameter that receives user input

type TaintedVariable

type TaintedVariable struct {
	ID       string       `json:"id"`
	Name     string       `json:"name"`
	Scope    string       `json:"scope"`  // Function/class scope
	Source   *InputSource `json:"source"` // Original input source
	Location Location     `json:"location"`
	Depth    int          `json:"depth"` // How many assignments from original source
	Language string       `json:"language"`
}

TaintedVariable represents a variable that holds user input at some point

type TraceResult

type TraceResult struct {
	// All discovered input sources
	Sources []*InputSource `json:"sources"`

	// All variables that hold user input at some point
	TaintedVariables []*TaintedVariable `json:"tainted_variables"`

	// All functions that receive user input (directly or transitively)
	TaintedFunctions []*TaintedFunction `json:"tainted_functions"`

	// Complete flow graph
	FlowGraph *FlowGraph `json:"flow_graph"`

	// Statistics
	Stats TraceStats `json:"stats"`

	// Errors encountered during analysis (parse errors, permission errors, etc.)
	Errors []error `json:"errors,omitempty"`
}

TraceResult is the complete result of tracing a codebase

func (*TraceResult) MarshalJSON

func (r *TraceResult) MarshalJSON() ([]byte, error)

MarshalJSON implements json.Marshaler so that the []error Errors field is serialized as a JSON array of strings (by calling .Error() on each entry) rather than as an array of empty objects (which is what encoding/json produces for interface values by default).

func (*TraceResult) ToJSON

func (r *TraceResult) ToJSON() (string, error)

ToJSON converts the trace result to JSON

type TraceStats

type TraceStats struct {
	FilesAnalyzed     int            `json:"files_analyzed"`
	SourcesFound      int            `json:"sources_found"`
	TaintedVarsFound  int            `json:"tainted_variables_found"`
	TaintedFuncsFound int            `json:"tainted_functions_found"`
	PropagationPaths  int            `json:"propagation_paths"`
	AnalysisDuration  time.Duration  `json:"analysis_duration_ns"`
	DurationMs        int64          `json:"analysis_duration_ms"`
	ByLanguage        map[string]int `json:"files_by_language"`
}

TraceStats contains analysis statistics

type Tracer

type Tracer struct {
	// contains filtered or unexported fields
}

Tracer is the main entry point for input tracing

func New

func New(config *Config) *Tracer

New creates a new Tracer with the given configuration

func (*Tracer) DoesReceiveInput

func (t *Tracer) DoesReceiveInput(result *TraceResult, funcName string) bool

DoesReceiveInput checks if a specific function receives user input

func (*Tracer) GetFlowPaths

func (t *Tracer) GetFlowPaths(result *TraceResult, source *InputSource) []*PropagationPath

GetFlowPaths returns all propagation paths from a specific source

func (*Tracer) GetInputSources

func (t *Tracer) GetInputSources(result *TraceResult) []*InputSource

GetInputSources returns all input sources found

func (*Tracer) GetTaintedFunctions

func (t *Tracer) GetTaintedFunctions(result *TraceResult) []*TaintedFunction

GetTaintedFunctions returns all functions that receive user input

func (*Tracer) GetTaintedVariables

func (t *Tracer) GetTaintedVariables(result *TraceResult) []*TaintedVariable

GetTaintedVariables returns all variables that hold user input

func (*Tracer) TraceDirectory

func (t *Tracer) TraceDirectory(dirPath string) (*TraceResult, error)

TraceDirectory analyzes a directory and returns all input flow information

func (*Tracer) TraceFile

func (t *Tracer) TraceFile(filePath string) (*TraceResult, error)

TraceFile analyzes a single source file and returns all input flow information found within it. Unlike TraceDirectory, it does not walk the filesystem or run inter-procedural analysis across multiple files; taint propagation is limited to what can be observed within filePath alone.

The returned TraceResult follows the same schema as TraceDirectory so callers can use the same output/reporting code for both entry points.

filePath must be an absolute or relative path to a regular file. If the file cannot be parsed (unsupported language, I/O error, etc.) the error is recorded in TraceResult.Errors rather than returned as the function error; a non-nil function error is only returned for truly unexpected failures.
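
Since per-file failures are reported through TraceResult.Errors and not through the returned error, callers need to check both. The sketch below shows that handling pattern against a hypothetical stub: traceFile and traceResult are local stand-ins that only imitate the documented contract and do not call the real package.

```go
package main

import (
	"errors"
	"fmt"
)

// traceResult is a trimmed stand-in for TraceResult.
type traceResult struct {
	Sources []string
	Errors  []error
}

// traceFile is a hypothetical stub that imitates the documented contract:
// per-file problems land in Errors, and the returned error stays nil.
func traceFile(path string) (*traceResult, error) {
	res := &traceResult{}
	if path == "" {
		res.Errors = append(res.Errors, errors.New("empty path"))
		return res, nil
	}
	res.Sources = append(res.Sources, "$_GET")
	return res, nil
}

// report handles both error channels and returns a one-line status.
func report(path string) string {
	res, err := traceFile(path)
	if err != nil {
		return "fatal: " + err.Error()
	}
	if len(res.Errors) > 0 {
		return fmt.Sprintf("%d recorded error(s), first: %v", len(res.Errors), res.Errors[0])
	}
	return fmt.Sprintf("%d source(s) found", len(res.Sources))
}

func main() {
	fmt.Println(report("index.php"))
	fmt.Println(report(""))
}
```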
