Documentation ¶
Index ¶
- Constants
- type AnalysisState
- type Argument
- type Config
- type FlowEdge
- type FlowEdgeType
- type FlowGraph
- type FlowNode
- type FlowNodeType
- type FullAnalysisState
- func (s *FullAnalysisState) AddPropagationStep(source *InputSource, step PropagationStep)
- func (s *FullAnalysisState) AddReturnsTaintedFunction(funcName string, source *InputSource)
- func (s *FullAnalysisState) AddSource(source *InputSource)
- func (s *FullAnalysisState) AddTaintedFunction(tf *TaintedFunction)
- func (s *FullAnalysisState) AddTaintedVariable(tv *TaintedVariable)
- func (s *FullAnalysisState) BuildFlowGraph() *FlowGraph
- func (s *FullAnalysisState) GetTaintedVariables() []*TaintedVariable
- func (s *FullAnalysisState) IsTainted(name, scope, filePath string) (*TaintedVariable, bool)
- type FunctionSummary
- type InputLabel
- type InputSource
- type InterproceduralAnalyzer
- func (ipa *InterproceduralAnalyzer) BuildFunctionSummary(node *sitter.Node, src []byte, filePath string, language string) *FunctionSummary
- func (ipa *InterproceduralAnalyzer) GetAllSummaries() map[string]*FunctionSummary
- func (ipa *InterproceduralAnalyzer) GetCallGraph() map[string][]string
- func (ipa *InterproceduralAnalyzer) GetFunctionSummary(name string) *FunctionSummary
- func (ipa *InterproceduralAnalyzer) PropagateInterproceduralTaint(callNode *sitter.Node, src []byte, filePath string, ...)
- func (ipa *InterproceduralAnalyzer) RunAnalysis(result *TraceResult)
- type Location
- type ParameterInfo
- type PropagationPath
- type PropagationStep
- type PropagationStepType
- type Scope
- type ScopeState
- type ScopeType
- type TaintInfo
- type TaintPropagator
- func (prop *TaintPropagator) PropagateFromAssignment(node *sitter.Node, src []byte, filePath string)
- func (prop *TaintPropagator) PropagateFromFunctionCall(node *sitter.Node, src []byte, filePath string)
- func (prop *TaintPropagator) PropagateFromReturn(node *sitter.Node, src []byte, filePath string)
- type TaintedFunction
- type TaintedParam
- type TaintedVariable
- type TraceResult
- type TraceStats
- type Tracer
- func (t *Tracer) DoesReceiveInput(result *TraceResult, funcName string) bool
- func (t *Tracer) GetFlowPaths(result *TraceResult, source *InputSource) []*PropagationPath
- func (t *Tracer) GetInputSources(result *TraceResult) []*InputSource
- func (t *Tracer) GetTaintedFunctions(result *TraceResult) []*TaintedFunction
- func (t *Tracer) GetTaintedVariables(result *TraceResult) []*TaintedVariable
- func (t *Tracer) TraceDirectory(dirPath string) (*TraceResult, error)
- func (t *Tracer) TraceFile(filePath string) (*TraceResult, error)
Constants ¶
const (
LabelHTTPGet = constants.LabelHTTPGet
LabelHTTPPost = constants.LabelHTTPPost
LabelHTTPCookie = constants.LabelHTTPCookie
LabelHTTPHeader = constants.LabelHTTPHeader
LabelHTTPBody = constants.LabelHTTPBody
LabelCLI = constants.LabelCLI
LabelEnvironment = constants.LabelEnvironment
LabelFile = constants.LabelFile
LabelDatabase = constants.LabelDatabase
LabelNetwork = constants.LabelNetwork
LabelUserInput = constants.LabelUserInput
)
Re-export InputLabel constants for backward compatibility
const (
StepAssignment = constants.StepAssignment
StepParameterPass = constants.StepParameterPass
StepReturn = constants.StepReturn
StepInterproceduralReturn = constants.StepInterproceduralReturn
StepConcatenation = constants.StepConcatenation
StepArrayAccess = constants.StepArrayAccess
StepObjectAccess = constants.StepObjectAccess
StepDestructure = constants.StepDestructure
)
Re-export PropagationStepType constants for backward compatibility
const (
ScopeGlobal = constants.ScopeGlobal
ScopeFile = constants.ScopeFile
ScopeModule = constants.ScopeModule
ScopeClass = constants.ScopeClass
ScopeFunction = constants.ScopeFunction
ScopeBlock = constants.ScopeBlock
)
Re-export ScopeType constants for backward compatibility
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type AnalysisState ¶
type AnalysisState struct {
ScopeState // scope management (push/pop, lookup)
TaintedValues map[string]*TaintedVariable // variable name -> tainted info (taint tracking)
FunctionSummaries map[string]*FunctionSummary
VisitedFunctions map[string]bool
}
AnalysisState maintains the current state during analysis of a single file.
It embeds ScopeState for scope management (push/pop scopes, hierarchical variable lookup) and adds taint-tracking fields (TaintedValues, FunctionSummaries, VisitedFunctions) as a separate concern. Keeping them in one struct is intentional for this package: both concerns are needed together on every file-analysis pass and separating them into two arguments everywhere would add noise without benefit. The split is expressed structurally via the embedded ScopeState so that the two concerns remain conceptually distinct.
See also: pkg/semantic/types.AnalysisState, which is an unrelated type used by the deep semantic analysis layer. The two types serve different levels of abstraction and should not be merged (see M26 note in types.go).
func NewAnalysisState ¶
func NewAnalysisState() *AnalysisState
NewAnalysisState creates a new analysis state with global scope
func (*AnalysisState) EnterScope ¶
func (s *AnalysisState) EnterScope(scopeType ScopeType, name string, startLine, endLine int) *Scope
EnterScope creates and enters a new scope (delegates to ScopeState).
func (*AnalysisState) ExitScope ¶
func (s *AnalysisState) ExitScope() *Scope
ExitScope exits the current scope and returns to parent (delegates to ScopeState).
func (*AnalysisState) LookupVariable ¶
func (s *AnalysisState) LookupVariable(name string) (*TaintedVariable, bool)
LookupVariable looks up a variable in current and parent scopes, then falls back to the flat TaintedValues map for file-global tainted variables.
func (*AnalysisState) SetTainted ¶
func (s *AnalysisState) SetTainted(name string, tainted *TaintedVariable)
SetTainted marks a variable as tainted in the current scope and in the flat TaintedValues map so that findTaintInfo can reach it via O(1) lookup.
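The interplay between SetTainted and LookupVariable can be sketched with local stand-in types. This is a minimal illustration, not the package's implementation: the names taintedVar, scope, state, enter, exit, setTainted and lookup are all hypothetical, and the sketch assumes that leaving a scope does not remove entries from the flat map.

```go
package main

import "fmt"

// taintedVar is a hypothetical stand-in for TaintedVariable.
type taintedVar struct {
	Name  string
	Scope string
}

// scope is a hypothetical stand-in for Scope: a variable table plus a parent link.
type scope struct {
	name   string
	parent *scope
	vars   map[string]*taintedVar
}

// state is a hypothetical stand-in for AnalysisState.
type state struct {
	current *scope
	flat    map[string]*taintedVar // plays the role of TaintedValues
}

func newState() *state {
	global := &scope{name: "global", vars: map[string]*taintedVar{}}
	return &state{current: global, flat: map[string]*taintedVar{}}
}

// enter pushes a child scope; exit returns to the parent (never past global).
func (s *state) enter(name string) {
	s.current = &scope{name: name, parent: s.current, vars: map[string]*taintedVar{}}
}

func (s *state) exit() {
	if s.current.parent != nil {
		s.current = s.current.parent
	}
}

// setTainted records the variable in the current scope and in the flat map.
func (s *state) setTainted(name string) {
	tv := &taintedVar{Name: name, Scope: s.current.name}
	s.current.vars[name] = tv
	s.flat[name] = tv
}

// lookup walks the scope chain first and only then consults the flat map.
func (s *state) lookup(name string) (*taintedVar, bool) {
	for sc := s.current; sc != nil; sc = sc.parent {
		if tv, ok := sc.vars[name]; ok {
			return tv, true
		}
	}
	tv, ok := s.flat[name]
	return tv, ok
}

func main() {
	s := newState()
	s.enter("handler")
	s.setTainted("user")
	s.exit()

	// "user" is no longer on the scope chain, but the flat map still holds it.
	tv, ok := s.lookup("user")
	fmt.Println(ok, tv.Scope) // true handler
}
```

In this sketch the flat map is what makes a variable reachable once its scope has been popped; whether the real package keeps such entries after ExitScope is an assumption here.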
type Config ¶
type Config struct {
// Languages to analyze (empty = all supported)
Languages []string
// Maximum inter-procedural analysis depth
MaxDepth int
// Number of parallel workers
Workers int
// Custom source definitions (in addition to built-in)
CustomSources []sources.Definition
// Skip directories matching these patterns
SkipDirs []string
// Include only files matching these patterns (empty = all)
IncludePatterns []string
// Verbose enables diagnostic logging to stdout during analysis
Verbose bool
}
Config configures the tracer
func DefaultConfig ¶
func DefaultConfig() *Config
DefaultConfig returns sensible defaults using centralized sources
type FlowEdge ¶
type FlowEdge struct {
From string `json:"from"` // Node ID
To string `json:"to"` // Node ID
Type FlowEdgeType `json:"type"`
Location Location `json:"location"`
}
FlowEdge connects two nodes showing data flow
type FlowEdgeType ¶
type FlowEdgeType = constants.FlowEdgeType
FlowEdgeType represents how data flows between two nodes. Re-exported from pkg/sources/constants.
const (
FlowEdgeAssignment FlowEdgeType = constants.EdgeAssignment
FlowEdgeCall FlowEdgeType = constants.EdgeCall
FlowEdgeReturn FlowEdgeType = constants.EdgeReturn
FlowEdgeTaint FlowEdgeType = constants.EdgeDataFlow
FlowEdgeParameter FlowEdgeType = constants.EdgeParameter
)
FlowEdge type constants
type FlowNode ¶
type FlowNode struct {
ID string `json:"id"`
Type FlowNodeType `json:"type"`
Name string `json:"name"`
Location Location `json:"location"`
}
FlowNode is a node in the flow graph
type FlowNodeType ¶
type FlowNodeType = constants.FlowNodeType
FlowNodeType represents the type of a node in the data flow graph. Re-exported from pkg/sources/constants.
const (
FlowNodeSource FlowNodeType = constants.NodeSource
FlowNodeVariable FlowNodeType = constants.NodeVariable
FlowNodeFunction FlowNodeType = constants.NodeFunction
FlowNodeParam FlowNodeType = constants.NodeParam
FlowNodeCarrier FlowNodeType = constants.NodeCarrier
FlowNodeProperty FlowNodeType = constants.NodeProperty
FlowNodeReturn FlowNodeType = constants.NodeReturn
)
FlowNode type constants
type FullAnalysisState ¶
type FullAnalysisState struct {
*AnalysisState
// Slices for output (computed on demand)
Sources []*InputSource
TaintedVariables []*TaintedVariable
TaintedFunctions []*TaintedFunction
PropagationPaths map[string][]*PropagationPath // source ID -> paths
ReturnsTainted map[string]*InputSource // function name -> source
// contains filtered or unexported fields
}
FullAnalysisState extends AnalysisState with the output slices above and with additional unexported fields that provide O(1) lookups and deduplication.
func NewFullAnalysisState ¶
func NewFullAnalysisState() *FullAnalysisState
NewFullAnalysisState creates a complete analysis state with optimized maps
func (*FullAnalysisState) AddPropagationStep ¶
func (s *FullAnalysisState) AddPropagationStep(source *InputSource, step PropagationStep)
AddPropagationStep adds a propagation step for a source
func (*FullAnalysisState) AddReturnsTaintedFunction ¶
func (s *FullAnalysisState) AddReturnsTaintedFunction(funcName string, source *InputSource)
AddReturnsTaintedFunction marks a function as returning tainted data
func (*FullAnalysisState) AddSource ¶
func (s *FullAnalysisState) AddSource(source *InputSource)
AddSource adds a new input source with O(1) deduplication
func (*FullAnalysisState) AddTaintedFunction ¶
func (s *FullAnalysisState) AddTaintedFunction(tf *TaintedFunction)
AddTaintedFunction adds a tainted function with O(1) deduplication
func (*FullAnalysisState) AddTaintedVariable ¶
func (s *FullAnalysisState) AddTaintedVariable(tv *TaintedVariable)
AddTaintedVariable adds a tainted variable with O(1) deduplication. taintedVarsMap is the single source of truth; AnalysisState.TaintedValues is not written here to avoid the dual-map divergence.
func (*FullAnalysisState) BuildFlowGraph ¶
func (s *FullAnalysisState) BuildFlowGraph() *FlowGraph
BuildFlowGraph builds a flow graph from the analysis state
func (*FullAnalysisState) GetTaintedVariables ¶
func (s *FullAnalysisState) GetTaintedVariables() []*TaintedVariable
GetTaintedVariables returns all tainted variables
func (*FullAnalysisState) IsTainted ¶
func (s *FullAnalysisState) IsTainted(name, scope, filePath string) (*TaintedVariable, bool)
IsTainted reports whether the variable identified by (name, scope, filePath) is tracked in the FullAnalysisState's dedup map.
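The O(1) deduplication described for AddTaintedVariable and the (name, scope, filePath) lookup of IsTainted can be approximated as follows. The unexported fields of FullAnalysisState are not documented, so registry, varKey, add and isTainted are hypothetical names, and the struct key is an assumption about how such a dedup map might be keyed.

```go
package main

import "fmt"

// taintedVar is a hypothetical stand-in for TaintedVariable.
type taintedVar struct {
	Name, Scope, FilePath string
}

// varKey is a hypothetical composite key. A struct key avoids the separator
// collisions that a concatenated string key could produce.
type varKey struct {
	name, scope, filePath string
}

// registry keeps one map for O(1) membership tests and one slice for output.
type registry struct {
	byKey   map[varKey]*taintedVar
	ordered []*taintedVar
}

func newRegistry() *registry {
	return &registry{byKey: map[varKey]*taintedVar{}}
}

// add records tv unless an entry with the same key exists.
// It reports whether tv was newly recorded.
func (r *registry) add(tv *taintedVar) bool {
	k := varKey{tv.Name, tv.Scope, tv.FilePath}
	if _, seen := r.byKey[k]; seen {
		return false
	}
	r.byKey[k] = tv
	r.ordered = append(r.ordered, tv)
	return true
}

// isTainted looks the variable up by its full key.
func (r *registry) isTainted(name, scope, filePath string) (*taintedVar, bool) {
	tv, ok := r.byKey[varKey{name, scope, filePath}]
	return tv, ok
}

func main() {
	r := newRegistry()
	fmt.Println(r.add(&taintedVar{"id", "show", "a.php"})) // true
	fmt.Println(r.add(&taintedVar{"id", "show", "a.php"})) // false (duplicate)
	fmt.Println(r.add(&taintedVar{"id", "edit", "a.php"})) // true (different scope)

	_, ok := r.isTainted("id", "show", "a.php")
	fmt.Println(ok, len(r.ordered)) // true 2
}
```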
type FunctionSummary ¶
type FunctionSummary struct {
Name string `json:"name"`
FilePath string `json:"file_path"`
Language string `json:"language"`
StartLine int `json:"start_line"`
EndLine int `json:"end_line"`
Parameters []ParameterInfo `json:"parameters"`
ParamsToReturn []int `json:"params_to_return"` // Indices of params that flow to return
ParamsToParams map[int][]int `json:"params_to_params"` // Param N flows to param M in nested calls
IsSource bool `json:"is_source"` // Function itself returns user input
CalledFunctions []string `json:"called_functions"`
}
FunctionSummary captures how a function propagates input
func (*FunctionSummary) GetParamName ¶
func (fs *FunctionSummary) GetParamName(index int) string
GetParamName returns parameter name by index
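One way a summary like this might be consumed at a call site is to check whether any tainted argument position appears in ParamsToReturn. The sketch below mirrors only three FunctionSummary fields in a local type; summary and returnTainted are hypothetical names, and the decision rule is an assumption rather than the analyzer's documented behavior.

```go
package main

import "fmt"

// summary mirrors a subset of the FunctionSummary fields.
type summary struct {
	Name           string
	ParamsToReturn []int // indices of params that flow to the return value
	IsSource       bool  // function itself returns user input
}

// returnTainted reports whether the result of a call should be treated as
// tainted, given which argument positions are tainted at the call site.
func returnTainted(s summary, taintedArgs map[int]bool) bool {
	if s.IsSource {
		return true
	}
	for _, idx := range s.ParamsToReturn {
		if taintedArgs[idx] {
			return true
		}
	}
	return false
}

func main() {
	trim := summary{Name: "trim", ParamsToReturn: []int{0}}

	fmt.Println(returnTainted(trim, map[int]bool{0: true})) // true
	fmt.Println(returnTainted(trim, map[int]bool{1: true})) // false
}
```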
type InputLabel ¶
type InputLabel = constants.InputLabel
InputLabel categorizes the type of user input. Re-exported from pkg/sources/constants for backward compatibility.
type InputSource ¶
type InputSource struct {
ID string `json:"id"`
Type string `json:"type"` // e.g., "$_GET", "req.body", "argv"
Key string `json:"key"` // e.g., "username" in $_GET['username']
Location Location `json:"location"`
Labels []InputLabel `json:"labels"`
Language string `json:"language"`
}
InputSource represents where user input enters the code
type InterproceduralAnalyzer ¶
type InterproceduralAnalyzer struct {
// contains filtered or unexported fields
}
InterproceduralAnalyzer coordinates cross-function taint analysis. Its single responsibility is orchestrating three subordinate concerns:
- Building per-function summaries (BuildFunctionSummary and helpers).
- Maintaining the call graph (via the embedded *callGraph).
- Propagating taint across function boundaries (PropagateInterproceduralTaint, RunAnalysis, propagateReturnTaint, propagateCallTaint).
func NewInterproceduralAnalyzer ¶
func NewInterproceduralAnalyzer(state *FullAnalysisState, maxDepth int, parserSvc *parser.Service) *InterproceduralAnalyzer
NewInterproceduralAnalyzer creates a new inter-procedural analyzer
func (*InterproceduralAnalyzer) BuildFunctionSummary ¶
func (ipa *InterproceduralAnalyzer) BuildFunctionSummary(node *sitter.Node, src []byte, filePath string, language string) *FunctionSummary
BuildFunctionSummary builds a summary for a function definition
func (*InterproceduralAnalyzer) GetAllSummaries ¶
func (ipa *InterproceduralAnalyzer) GetAllSummaries() map[string]*FunctionSummary
GetAllSummaries returns all function summaries
func (*InterproceduralAnalyzer) GetCallGraph ¶
func (ipa *InterproceduralAnalyzer) GetCallGraph() map[string][]string
GetCallGraph returns a snapshot of the call graph (caller → []callee). The returned map is a deep copy; callers may read it without any lock.
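A deep-copy snapshot of this kind is commonly written by copying each callee slice while holding a read lock. The sketch below assumes a sync.RWMutex-guarded map; graph, newGraph, addCall and snapshot are hypothetical names and do not describe the package's unexported call-graph type.

```go
package main

import (
	"fmt"
	"sync"
)

// graph is a hypothetical lock-protected call graph (caller -> callees).
type graph struct {
	mu    sync.RWMutex
	edges map[string][]string
}

func newGraph() *graph {
	return &graph{edges: map[string][]string{}}
}

func (g *graph) addCall(caller, callee string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.edges[caller] = append(g.edges[caller], callee)
}

// snapshot copies the map and every slice in it, so the result shares no
// memory with the live graph and can be read without holding the lock.
func (g *graph) snapshot() map[string][]string {
	g.mu.RLock()
	defer g.mu.RUnlock()
	out := make(map[string][]string, len(g.edges))
	for caller, callees := range g.edges {
		out[caller] = append([]string(nil), callees...)
	}
	return out
}

func main() {
	g := newGraph()
	g.addCall("main", "handle")

	snap := g.snapshot()
	snap["main"][0] = "mutated" // changes only the copy
	g.addCall("main", "render") // changes only the live graph

	fmt.Println(snap["main"], g.snapshot()["main"]) // [mutated] [handle render]
}
```

Copying only the map would not be enough: the slices would still alias the live graph's backing arrays.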
func (*InterproceduralAnalyzer) GetFunctionSummary ¶
func (ipa *InterproceduralAnalyzer) GetFunctionSummary(name string) *FunctionSummary
GetFunctionSummary returns a function summary by name
func (*InterproceduralAnalyzer) PropagateInterproceduralTaint ¶
func (ipa *InterproceduralAnalyzer) PropagateInterproceduralTaint(callNode *sitter.Node, src []byte, filePath string, callerState *FullAnalysisState, visited map[string]bool)
PropagateInterproceduralTaint propagates taint across function boundaries. visited is a caller-owned map used to prevent re-entrant processing of the same call site; the caller must allocate it (make(map[string]bool)) once per top-level propagation request and pass the same map on recursive invocations. Keeping visited outside the struct eliminates the data race that arose when ipa.visited was read and written without holding ipa.mu.
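The caller-owned visited map follows a familiar pattern for cycle-safe recursion. The sketch below is a simplification: it keys visited by function name rather than by call site, and demoCalls, propagate and reachable are hypothetical names.

```go
package main

import "fmt"

// demoCalls is a hypothetical call graph containing recursion and a cycle.
var demoCalls = map[string][]string{
	"main":   {"handle"},
	"handle": {"render", "handle"}, // direct recursion
	"render": {"main"},             // cycle back to the root
}

// propagate visits every function reachable from fn exactly once. The visited
// map is owned by the caller and threaded through each recursive call, so no
// shared state lives on a struct that other goroutines might touch.
func propagate(calls map[string][]string, fn string, visited map[string]bool, order *[]string) {
	if visited[fn] {
		return
	}
	visited[fn] = true
	*order = append(*order, fn)
	for _, callee := range calls[fn] {
		propagate(calls, callee, visited, order)
	}
}

// reachable allocates the visited map once per top-level request.
func reachable(calls map[string][]string, root string) []string {
	visited := make(map[string]bool)
	var order []string
	propagate(calls, root, visited, &order)
	return order
}

func main() {
	fmt.Println(reachable(demoCalls, "main")) // [main handle render]
}
```

Because each top-level request gets its own map, concurrent requests cannot interfere with each other's bookkeeping.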
func (*InterproceduralAnalyzer) RunAnalysis ¶
func (ipa *InterproceduralAnalyzer) RunAnalysis(result *TraceResult)
RunAnalysis performs cross-function taint analysis against the provided result, collecting all unique file paths, building function summaries, and iteratively propagating taint until a fixed point is reached.
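Iterating to a fixed point means repeating a propagation pass until a full pass changes nothing. The following is a heavily reduced sketch of that idea, assuming each function is described only by whether it is a source and whose results it returns; fnInfo, demoFuncs and returnsTainted are hypothetical names, and the real analyzer also works with files, parameters and a depth limit.

```go
package main

import "fmt"

// fnInfo is a hypothetical, reduced function summary.
type fnInfo struct {
	isSource        bool     // function itself returns user input
	returnsResultOf []string // callees whose result this function returns
}

// demoFuncs returns a small hypothetical program.
func demoFuncs() map[string]fnInfo {
	return map[string]fnInfo{
		"readInput":   {isSource: true},
		"getUser":     {returnsResultOf: []string{"readInput"}},
		"loadProfile": {returnsResultOf: []string{"getUser"}},
		"version":     {},
	}
}

// returnsTainted repeats the propagation pass until no function changes
// state. The tainted set only grows and is bounded by len(funcs), so the
// loop terminates.
func returnsTainted(funcs map[string]fnInfo) map[string]bool {
	tainted := map[string]bool{}
	for name, f := range funcs {
		if f.isSource {
			tainted[name] = true
		}
	}
	for changed := true; changed; {
		changed = false
		for name, f := range funcs {
			if tainted[name] {
				continue
			}
			for _, callee := range f.returnsResultOf {
				if tainted[callee] {
					tainted[name] = true
					changed = true
					break
				}
			}
		}
	}
	return tainted
}

func main() {
	// fmt prints map keys in sorted order.
	fmt.Println(returnsTainted(demoFuncs()))
	// map[getUser:true loadProfile:true readInput:true]
}
```

The number of passes depends on map iteration order, but the final set does not, which is the property a fixed-point formulation provides.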
type Location ¶
type Location struct {
FilePath string `json:"file_path"`
Line int `json:"line"`
Column int `json:"column"`
EndLine int `json:"end_line"`
EndColumn int `json:"end_column"`
Snippet string `json:"snippet,omitempty"`
}
Location represents a precise location in source code
type ParameterInfo ¶
type ParameterInfo struct {
Index int `json:"index"`
Name string `json:"name"`
Type string `json:"type,omitempty"`
}
ParameterInfo contains information about a function parameter
type PropagationPath ¶
type PropagationPath struct {
Source *InputSource `json:"source"`
Steps []PropagationStep `json:"steps"`
Destination Location `json:"destination"`
}
PropagationPath shows how input flows from source to destination
type PropagationStep ¶
type PropagationStep struct {
Type PropagationStepType `json:"type"`
Variable string `json:"variable"`
Function string `json:"function,omitempty"` // If crossing function boundary
Location Location `json:"location"`
}
PropagationStep is one step in the propagation chain
type PropagationStepType ¶
type PropagationStepType = constants.PropagationStepType
PropagationStepType defines the type of propagation step. Re-exported from pkg/sources/constants for backward compatibility.
type Scope ¶
type Scope struct {
ID string `json:"id"`
Type ScopeType `json:"type"`
Name string `json:"name"`
Parent *Scope `json:"-"` // Avoid circular JSON
ParentID string `json:"parent_id,omitempty"`
Children []*Scope `json:"-"` // Child scopes
Variables map[string]*TaintedVariable `json:"-"`
StartLine int `json:"start_line"`
EndLine int `json:"end_line"`
StartLoc Location `json:"start_location,omitempty"`
}
Scope represents a variable scope in the code
type ScopeState ¶
ScopeState manages the scope stack during AST traversal. It is a pure scope-management concern: push/pop scopes, look up variables within the lexical hierarchy. It knows nothing about taint tracking.
func (*ScopeState) EnterScope ¶
func (ss *ScopeState) EnterScope(scopeType ScopeType, name string, startLine, endLine int) *Scope
EnterScope pushes a new named scope onto the stack.
func (*ScopeState) ExitScope ¶
func (ss *ScopeState) ExitScope() *Scope
ExitScope pops the current scope and restores the parent.
func (*ScopeState) LookupVariable ¶
func (ss *ScopeState) LookupVariable(name string) (*TaintedVariable, bool)
LookupVariable searches for a variable from the current scope upward.
type ScopeType ¶
ScopeType represents the type of scope. Re-exported from pkg/sources/constants for backward compatibility.
type TaintInfo ¶
type TaintInfo struct {
Source *InputSource
Depth int
}
TaintInfo contains information about a tainted value
type TaintPropagator ¶
type TaintPropagator struct {
// contains filtered or unexported fields
}
TaintPropagator handles taint propagation through code
func NewTaintPropagator ¶
func NewTaintPropagator(state *FullAnalysisState, language string) *TaintPropagator
NewTaintPropagator creates a new taint propagator
func (*TaintPropagator) PropagateFromAssignment ¶
func (prop *TaintPropagator) PropagateFromAssignment(node *sitter.Node, src []byte, filePath string)
PropagateFromAssignment propagates taint from an assignment expression
func (*TaintPropagator) PropagateFromFunctionCall ¶
func (prop *TaintPropagator) PropagateFromFunctionCall(node *sitter.Node, src []byte, filePath string)
PropagateFromFunctionCall propagates taint through function calls
func (*TaintPropagator) PropagateFromReturn ¶
func (prop *TaintPropagator) PropagateFromReturn(node *sitter.Node, src []byte, filePath string)
PropagateFromReturn propagates taint from return statements
type TaintedFunction ¶
type TaintedFunction struct {
ID string `json:"id"`
Name string `json:"name"`
FilePath string `json:"file_path"`
Line int `json:"line"`
Language string `json:"language"`
TaintedParams []TaintedParam `json:"tainted_params"`
ReceivesThrough []PropagationPath `json:"receives_through,omitempty"`
}
TaintedFunction represents a function that receives user input
type TaintedParam ¶
type TaintedParam struct {
Index int `json:"index"`
Name string `json:"name"`
Source *InputSource `json:"source"`
Path *PropagationPath `json:"path,omitempty"`
}
TaintedParam represents a function parameter that receives user input
type TaintedVariable ¶
type TaintedVariable struct {
ID string `json:"id"`
Name string `json:"name"`
Scope string `json:"scope"` // Function/class scope
Source *InputSource `json:"source"` // Original input source
Location Location `json:"location"`
Depth int `json:"depth"` // How many assignments from original source
Language string `json:"language"`
}
TaintedVariable represents a variable that holds user input at some point
type TraceResult ¶
type TraceResult struct {
// All discovered input sources
Sources []*InputSource `json:"sources"`
// All variables that hold user input at some point
TaintedVariables []*TaintedVariable `json:"tainted_variables"`
// All functions that receive user input (directly or transitively)
TaintedFunctions []*TaintedFunction `json:"tainted_functions"`
// Complete flow graph
FlowGraph *FlowGraph `json:"flow_graph"`
// Statistics
Stats TraceStats `json:"stats"`
// Errors encountered during analysis (parse errors, permission errors, etc.)
Errors []error `json:"errors,omitempty"`
}
TraceResult is the complete result of tracing a codebase
func (*TraceResult) MarshalJSON ¶
func (r *TraceResult) MarshalJSON() ([]byte, error)
MarshalJSON implements json.Marshaler so that the []error Errors field is serialized as a JSON array of strings (by calling .Error() on each entry) rather than as an array of empty objects, which is what encoding/json produces by default for error values whose concrete types export no fields.
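The technique can be shown on a two-field stand-in type. This is a sketch of the general approach, not the package's code: result, sample and encode are hypothetical, and the real TraceResult has more fields to carry across.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// result is a hypothetical, reduced stand-in for TraceResult.
type result struct {
	Files  int     `json:"files"`
	Errors []error `json:"errors,omitempty"`
}

// MarshalJSON encodes a shadow struct in which Errors is a []string.
func (r *result) MarshalJSON() ([]byte, error) {
	msgs := make([]string, 0, len(r.Errors))
	for _, err := range r.Errors {
		if err != nil {
			msgs = append(msgs, err.Error())
		}
	}
	return json.Marshal(struct {
		Files  int      `json:"files"`
		Errors []string `json:"errors,omitempty"`
	}{r.Files, msgs})
}

// sample builds a result with one recorded error.
func sample() *result {
	return &result{Files: 2, Errors: []error{errors.New("parse failed: a.php")}}
}

// encode marshals r and returns the JSON text.
func encode(r *result) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(sample()))
	// {"files":2,"errors":["parse failed: a.php"]}

	// Without a custom marshaler the message is lost.
	raw, _ := json.Marshal([]error{errors.New("lost")})
	fmt.Println(string(raw)) // [{}]
}
```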
func (*TraceResult) ToJSON ¶
func (r *TraceResult) ToJSON() (string, error)
ToJSON converts the trace result to JSON
type TraceStats ¶
type TraceStats struct {
FilesAnalyzed int `json:"files_analyzed"`
SourcesFound int `json:"sources_found"`
TaintedVarsFound int `json:"tainted_variables_found"`
TaintedFuncsFound int `json:"tainted_functions_found"`
PropagationPaths int `json:"propagation_paths"`
AnalysisDuration time.Duration `json:"analysis_duration_ns"`
DurationMs int64 `json:"analysis_duration_ms"`
ByLanguage map[string]int `json:"files_by_language"`
}
TraceStats contains analysis statistics
type Tracer ¶
type Tracer struct {
// contains filtered or unexported fields
}
Tracer is the main entry point for input tracing
func (*Tracer) DoesReceiveInput ¶
func (t *Tracer) DoesReceiveInput(result *TraceResult, funcName string) bool
DoesReceiveInput checks if a specific function receives user input
func (*Tracer) GetFlowPaths ¶
func (t *Tracer) GetFlowPaths(result *TraceResult, source *InputSource) []*PropagationPath
GetFlowPaths returns all propagation paths from a specific source
func (*Tracer) GetInputSources ¶
func (t *Tracer) GetInputSources(result *TraceResult) []*InputSource
GetInputSources returns all input sources found
func (*Tracer) GetTaintedFunctions ¶
func (t *Tracer) GetTaintedFunctions(result *TraceResult) []*TaintedFunction
GetTaintedFunctions returns all functions that receive user input
func (*Tracer) GetTaintedVariables ¶
func (t *Tracer) GetTaintedVariables(result *TraceResult) []*TaintedVariable
GetTaintedVariables returns all variables that hold user input
func (*Tracer) TraceDirectory ¶
func (t *Tracer) TraceDirectory(dirPath string) (*TraceResult, error)
TraceDirectory analyzes a directory and returns all input flow information
func (*Tracer) TraceFile ¶
func (t *Tracer) TraceFile(filePath string) (*TraceResult, error)
TraceFile analyzes a single source file and returns all input flow information found within it. Unlike TraceDirectory it does NOT walk the filesystem or run inter-procedural analysis across multiple files — taint propagation is limited to what can be observed within filePath alone.
The returned TraceResult follows the same schema as TraceDirectory so callers can use the same output/reporting code for both entry points.
filePath must be an absolute or relative path to a regular file. If the file cannot be parsed (unsupported language, I/O error, etc.) the error is recorded in TraceResult.Errors rather than returned as the function error; a non-nil function error is only returned for truly unexpected failures.
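Because per-file problems land in TraceResult.Errors, a caller generally needs to check both the returned error and the recorded ones. The sketch below uses a function type with the same shape as TraceFile and a fake implementation; traceResult, traceFunc, fakeTrace and summarize are hypothetical, and the fake's behavior is invented for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// traceResult is a hypothetical, reduced stand-in for TraceResult.
type traceResult struct {
	Sources []string
	Errors  []error
}

// traceFunc has the same shape as Tracer.TraceFile.
type traceFunc func(filePath string) (*traceResult, error)

// fakeTrace imitates the three outcomes described above.
func fakeTrace(path string) (*traceResult, error) {
	switch path {
	case "ok.php":
		return &traceResult{Sources: []string{"$_GET"}}, nil
	case "notes.txt":
		return &traceResult{Errors: []error{errors.New("unsupported language")}}, nil
	default:
		return nil, errors.New("unexpected failure")
	}
}

// summarize separates a hard failure from errors recorded in the result.
func summarize(trace traceFunc, path string) string {
	res, err := trace(path)
	if err != nil {
		return "failed: " + err.Error()
	}
	if res == nil {
		return "failed: no result"
	}
	if len(res.Errors) > 0 {
		return fmt.Sprintf("incomplete: %d source(s), %d recorded error(s)",
			len(res.Sources), len(res.Errors))
	}
	return fmt.Sprintf("ok: %d source(s)", len(res.Sources))
}

func main() {
	for _, p := range []string{"ok.php", "notes.txt", "missing"} {
		fmt.Println(summarize(fakeTrace, p))
	}
}
```

A nil function error therefore does not by itself mean the file was analyzed successfully.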