Documentation ¶
Overview ¶
Package semantic provides a semantic input tracer that analyzes a codebase and traces the flow of user input across files and across function boundaries (cross-file, inter-procedural analysis).
Index ¶
- func ToDOT(r *TraceResult) string
- func ToHTML(r *TraceResult) string
- func ToJSON(r *TraceResult) (string, error)
- func ToMermaid(r *TraceResult) string
- type Config
- type FileInfo
- type LanguageStats
- type TraceContext
- type TraceResult
- func (r *TraceResult) GetSourcesByFile(filePath string) []*types.FlowNode
- func (r *TraceResult) GetSourcesByType(sourceType types.SourceType) []*types.FlowNode
- func (r *TraceResult) HasInputAtFunction(funcName string) bool
- func (r *TraceResult) ToDOT() string
- func (r *TraceResult) ToHTML() string
- func (r *TraceResult) ToJSON() (string, error)
- func (r *TraceResult) ToMermaid() string
- type TraceStats
- type Tracer
- func (t *Tracer) Close()
- func (t *Tracer) ParseOnly(path string) (*TraceResult, error)
- func (t *Tracer) TraceBackward(target string, codebasePath string) (*types.BackwardTraceResult, error)
- func (t *Tracer) TraceBackwardBatch(targets []string, codebasePath string) (*types.BatchTraceResult, error)
- func (t *Tracer) TraceDirectory(path string) (*TraceResult, error)
- func (t *Tracer) TraceFile(path string) (*TraceResult, error)
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ToDOT ¶
func ToDOT(r *TraceResult) string
ToDOT converts the trace result to GraphViz DOT format
func ToHTML ¶
func ToHTML(r *TraceResult) string
ToHTML converts the trace result to interactive HTML format
func ToJSON ¶
func ToJSON(r *TraceResult) (string, error)
ToJSON converts the trace result to JSON format
func ToMermaid ¶
func ToMermaid(r *TraceResult) string
ToMermaid converts the trace result to Mermaid diagram format
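To make the export formats concrete, here is a minimal sketch of what rendering a flow graph as GraphViz DOT can look like. It does not call this package: `edge` and `toDOT` are hypothetical stand-ins written for illustration, and the DOT text that the real ToDOT emits may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// edge is a hypothetical flow edge between two named nodes.
// It is NOT a type from this package.
type edge struct{ From, To string }

// toDOT renders edges as a GraphViz digraph. Node names are quoted with %q,
// which is adequate for simple labels such as variable names.
func toDOT(edges []edge) string {
	var b strings.Builder
	b.WriteString("digraph flow {\n")
	for _, e := range edges {
		fmt.Fprintf(&b, "  %q -> %q;\n", e.From, e.To)
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	fmt.Print(toDOT([]edge{
		{From: "$_GET", To: "$id"},
		{From: "$id", To: "query()"},
	}))
}
```

The resulting text can be piped to `dot -Tsvg` to produce an image.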
Types ¶
type Config ¶
type Config struct {
// Languages to analyze (empty = auto-detect all)
Languages []string
// MaxDepth for inter-procedural analysis
MaxDepth int
// Workers for parallel analysis
Workers int
// FollowImports enables cross-file analysis
FollowImports bool
// Verbose enables detailed logging
Verbose bool
// IncludePatterns for file filtering (glob patterns)
IncludePatterns []string
// ExcludePatterns for file filtering (glob patterns)
ExcludePatterns []string
// MaxMemoryMB is the maximum memory usage in MB (0 = use default 100MB)
// Applied to all modes to prevent OOM on large codebases
MaxMemoryMB int
// MaxFileSizeBytes is the maximum file size to parse (0 = unlimited)
MaxFileSizeBytes int64
// MaxFiles is the maximum number of files to parse (0 = unlimited)
MaxFiles int
// MaxFlowNodes is the maximum number of nodes in the flow graph (0 = default 10000)
MaxFlowNodes int
// MaxFlowEdges is the maximum number of edges in the flow graph (0 = default 20000)
MaxFlowEdges int
}
Config configures the semantic tracer
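The field comments above state that a zero value means "use the default" for several limits. A minimal sketch of that resolution, assuming the defaults quoted in the comments (100 MB, 10000 nodes, 20000 edges): `limits` is a local stand-in that mirrors three Config fields, and `effectiveLimits` is a hypothetical helper, not a function of this package.

```go
package main

import "fmt"

// limits mirrors only the Config fields whose zero value is documented as
// "use default". It is a local stand-in, not the package's Config type.
type limits struct {
	MaxMemoryMB  int
	MaxFlowNodes int
	MaxFlowEdges int
}

// effectiveLimits is a hypothetical helper that replaces zero values with
// the defaults stated in the Config field comments.
func effectiveLimits(l limits) limits {
	if l.MaxMemoryMB == 0 {
		l.MaxMemoryMB = 100
	}
	if l.MaxFlowNodes == 0 {
		l.MaxFlowNodes = 10000
	}
	if l.MaxFlowEdges == 0 {
		l.MaxFlowEdges = 20000
	}
	return l
}

func main() {
	// Only MaxFlowNodes is set explicitly; the other two fall back to defaults.
	fmt.Printf("%+v\n", effectiveLimits(limits{MaxFlowNodes: 500}))
}
```

Note that MaxFileSizeBytes and MaxFiles are documented differently: for those fields, zero means unlimited rather than a default cap.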
type FileInfo ¶
type FileInfo struct {
Path string
Language string
SymbolTable *types.SymbolTable
Sources []*types.FlowNode
Assignments []*types.Assignment // Cached assignments for flow tracing (avoids re-parsing)
Calls []*types.CallSite // Cached calls for flow tracing (avoids re-parsing)
Root *sitter.Node // Only populated during parsing, released after
Content []byte // Only populated if NeedsReparse is false
ParseTime time.Duration
Error error
// NeedsReparse indicates the file needs re-parsing for deeper analysis
// (AST was released to save memory)
NeedsReparse bool
}
FileInfo holds information about a parsed file. It is optimized to not retain the AST and file content in memory after parsing.
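The release-then-reparse pattern described by Root and NeedsReparse can be sketched as follows. This is an illustration under assumptions, not the package's implementation: `tree`, `fileInfo`, `parse`, and `rootFor` are hypothetical names, and parsing is simulated.

```go
package main

import "fmt"

// tree is a hypothetical stand-in for a parsed syntax tree.
type tree struct{ nodes int }

// fileInfo mirrors a few FileInfo fields using simplified local types.
type fileInfo struct {
	Path         string
	Assignments  []string // small; kept after parsing
	Root         *tree    // large; released after parsing
	NeedsReparse bool
}

// parse simulates parsing: build a tree, extract what later passes need,
// then release the tree and record that it is gone.
func parse(path, src string) *fileInfo {
	fi := &fileInfo{Path: path, Root: &tree{nodes: len(src)}}
	fi.Assignments = []string{src} // stand-in for real assignment extraction
	fi.Root = nil
	fi.NeedsReparse = true
	return fi
}

// rootFor returns the tree, re-parsing only if it was released earlier.
func rootFor(fi *fileInfo, src string) *tree {
	if fi.NeedsReparse {
		fi.Root = &tree{nodes: len(src)}
		fi.NeedsReparse = false
	}
	return fi.Root
}

func main() {
	src := "$id = $_GET['id'];"
	fi := parse("a.php", src)
	fmt.Println(fi.Root == nil, fi.NeedsReparse, len(fi.Assignments))
	fmt.Println(rootFor(fi, src) != nil, fi.NeedsReparse)
}
```

The trade-off is memory for time: the cached assignments cover the common case, and the cost of a second parse is paid only by files that need deeper analysis.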
type LanguageStats ¶
type LanguageStats struct {
Files int
Sources int
Flows int
ParseErrors int
ParseTime time.Duration
AnalysisTime time.Duration
}
LanguageStats holds per-language statistics
type TraceContext ¶
type TraceContext struct {
// contains filtered or unexported fields
}
TraceContext provides per-trace-invocation isolation for thread safety. Each TraceBackward() call gets its own context with:
- Own parser instances (not shared → thread-safe)
- Cached assignments only (extracted once per file, reused in recursion)
- No AST caching (ASTs are huge, assignments are tiny)
- Release on completion (memory-efficient)
func (*TraceContext) Close ¶
func (ctx *TraceContext) Close()
Close releases all resources held by the context
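The isolation model described for TraceContext can be sketched like this: every invocation builds its own context, caches extracted assignments per file, and drops the cache when it finishes. The code is a sketch under assumptions and does not use this package; `traceCtx`, `assignmentsFor`, and `traceOnce` are hypothetical, and parsing is simulated.

```go
package main

import (
	"fmt"
	"sync"
)

// traceCtx is a hypothetical per-invocation context. It owns a small cache
// of extracted assignments and nothing that is shared with other traces.
type traceCtx struct {
	assignments map[string][]string // file path -> assignments, extracted once
	extractions int                 // how many times a file was actually parsed
}

func newTraceCtx() *traceCtx {
	return &traceCtx{assignments: make(map[string][]string)}
}

// assignmentsFor extracts assignments on first use and reuses them afterwards.
// A real tracer would parse the file here, keep the assignments, drop the AST.
func (c *traceCtx) assignmentsFor(path string) []string {
	if a, ok := c.assignments[path]; ok {
		return a
	}
	c.extractions++
	a := []string{"assignment from " + path}
	c.assignments[path] = a
	return a
}

// Close drops the cache so its memory can be reclaimed.
func (c *traceCtx) Close() { c.assignments = nil }

// traceOnce simulates one trace that revisits the same file during recursion
// and reports how many extractions were needed.
func traceOnce(path string, visits int) int {
	ctx := newTraceCtx()
	defer ctx.Close()
	for i := 0; i < visits; i++ {
		ctx.assignmentsFor(path)
	}
	return ctx.extractions
}

func main() {
	var wg sync.WaitGroup
	results := make([]int, 4)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = traceOnce("a.php", 3) // no state shared between goroutines
		}(i)
	}
	wg.Wait()
	fmt.Println(results) // each trace extracted the file exactly once
}
```

Because no context is shared between goroutines, the cache needs no locking.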
type TraceResult ¶
type TraceResult struct {
// All discovered input sources
Sources []*types.FlowNode
// Complete flow map
FlowMap *types.FlowMap
// Per-file information
Files map[string]*FileInfo
// Global symbol table (merged from all files)
GlobalSymbolTable *types.SymbolTable
// Per-file symbol tables (for symbolic execution)
SymbolTable map[string]*types.SymbolTable
// Statistics
Stats *TraceStats
}
TraceResult is the complete result of semantic tracing
func (*TraceResult) GetSourcesByFile ¶
func (r *TraceResult) GetSourcesByFile(filePath string) []*types.FlowNode
GetSourcesByFile returns sources in a specific file
func (*TraceResult) GetSourcesByType ¶
func (r *TraceResult) GetSourcesByType(sourceType types.SourceType) []*types.FlowNode
GetSourcesByType returns sources filtered by type
func (*TraceResult) HasInputAtFunction ¶
func (r *TraceResult) HasInputAtFunction(funcName string) bool
HasInputAtFunction checks if a function receives user input
func (*TraceResult) ToDOT ¶
func (r *TraceResult) ToDOT() string
ToDOT outputs the result as GraphViz DOT
func (*TraceResult) ToHTML ¶
func (r *TraceResult) ToHTML() string
ToHTML outputs the result as interactive HTML
func (*TraceResult) ToJSON ¶
func (r *TraceResult) ToJSON() (string, error)
ToJSON outputs the result as JSON
func (*TraceResult) ToMermaid ¶
func (r *TraceResult) ToMermaid() string
ToMermaid outputs the result as Mermaid diagram
type TraceStats ¶
type TraceStats struct {
FilesScanned int
FilesParsed int
FilesSkipped int
ParseErrors int
SourcesFound int
FlowsTraced int
CrossFileFlows int
TotalDuration time.Duration
ParseDuration time.Duration
AnalysisDuration time.Duration
ByLanguage map[string]*LanguageStats
}
TraceStats holds tracing statistics
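A short sketch of how a caller might turn these statistics into a report. The structs below are local stand-ins that copy a subset of the documented fields, and `summarize` is a hypothetical helper; neither is provided by this package.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// languageStats and traceStats copy a subset of the documented fields.
// They are local stand-ins, not the package's types.
type languageStats struct {
	Files, Sources, ParseErrors int
}

type traceStats struct {
	FilesScanned, FilesParsed, FilesSkipped int
	ByLanguage                              map[string]*languageStats
}

// summarize formats the totals followed by one line per language.
// Languages are sorted because Go map iteration order is not stable.
func summarize(s *traceStats) string {
	var b strings.Builder
	fmt.Fprintf(&b, "parsed %d of %d files (%d skipped)\n",
		s.FilesParsed, s.FilesScanned, s.FilesSkipped)

	langs := make([]string, 0, len(s.ByLanguage))
	for l := range s.ByLanguage {
		langs = append(langs, l)
	}
	sort.Strings(langs)

	for _, l := range langs {
		ls := s.ByLanguage[l]
		fmt.Fprintf(&b, "%s: %d files, %d sources, %d parse errors\n",
			l, ls.Files, ls.Sources, ls.ParseErrors)
	}
	return b.String()
}

func main() {
	fmt.Print(summarize(&traceStats{
		FilesScanned: 5, FilesParsed: 4, FilesSkipped: 1,
		ByLanguage: map[string]*languageStats{
			"php": {Files: 3, Sources: 2},
			"go":  {Files: 1},
		},
	}))
}
```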
type Tracer ¶
type Tracer struct {
// contains filtered or unexported fields
}
Tracer is the main semantic input tracer
func (*Tracer) ParseOnly ¶
func (t *Tracer) ParseOnly(path string) (*TraceResult, error)
ParseOnly parses files and builds symbol tables without running flow analysis. It is a fast mode intended for symbolic tracing.
func (*Tracer) TraceBackward ¶
func (t *Tracer) TraceBackward(target string, codebasePath string) (*types.BackwardTraceResult, error)
TraceBackward performs backward taint analysis from a target expression. This traces from a target variable/expression back to its input sources.
func (*Tracer) TraceBackwardBatch ¶
func (t *Tracer) TraceBackwardBatch(targets []string, codebasePath string) (*types.BatchTraceResult, error)
TraceBackwardBatch performs backward taint analysis for multiple target expressions in a single pass. This matters for performance: instead of reading every file once per target (N × files reads for N targets), it makes one pass through all files and checks every target against each file. The TraceContext and its assignment cache are shared across all targets.
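The difference between per-target scanning and a batched pass can be shown with a toy model. This is a sketch under assumptions, not the package's algorithm: `corpus`, `scanPerTarget`, and `scanBatched` are hypothetical, files live in memory, and a substring match stands in for real taint analysis.

```go
package main

import (
	"fmt"
	"strings"
)

// corpus is a hypothetical in-memory stand-in for a codebase: path -> content.
type corpus map[string]string

// scanPerTarget reads every file once per target (N × files reads).
func scanPerTarget(c corpus, targets []string) (hits map[string][]string, reads int) {
	hits = make(map[string][]string)
	for _, t := range targets {
		for path, content := range c {
			reads++
			if strings.Contains(content, t) {
				hits[t] = append(hits[t], path)
			}
		}
	}
	return hits, reads
}

// scanBatched reads every file once and checks all targets while it is in hand.
func scanBatched(c corpus, targets []string) (hits map[string][]string, reads int) {
	hits = make(map[string][]string)
	for path, content := range c {
		reads++
		for _, t := range targets {
			if strings.Contains(content, t) {
				hits[t] = append(hits[t], path)
			}
		}
	}
	return hits, reads
}

func main() {
	c := corpus{
		"a.php": "$id = $_GET['id'];",
		"b.php": "$name = $_POST['name'];",
		"c.php": "echo $id . $name;",
	}
	targets := []string{"$id", "$name"}

	_, naive := scanPerTarget(c, targets)
	_, batched := scanBatched(c, targets)
	fmt.Println(naive, batched) // 6 reads versus 3 reads
}
```

Both functions find the same matches; only the number of file reads changes, which is what dominates when each read involves disk I/O and parsing.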
func (*Tracer) TraceDirectory ¶
func (t *Tracer) TraceDirectory(path string) (*TraceResult, error)
TraceDirectory performs semantic tracing on a directory
Source Files ¶
Directories ¶
| Path | Synopsis |
|---|---|
| | Package analyzer defines the interface for language-specific analyzers |
| base | Package base provides shared helpers for language analyzers. |
| c | Package c implements the C language analyzer for semantic input tracing |
| cpp | Package cpp implements the C++ language analyzer for semantic input tracing |
| csharp | Package csharp implements the C# language analyzer for semantic input tracing |
| golang | Package golang implements the Go language analyzer for semantic input tracing |
| java | Package java implements the Java language analyzer for semantic input tracing |
| javascript | Package javascript implements the JavaScript language analyzer for semantic input tracing |
| php | Package php implements the PHP language analyzer for semantic input tracing |
| python | Package python implements the Python language analyzer for semantic input tracing |
| ruby | Package ruby implements the Ruby language analyzer for semantic input tracing |
| rust | Package rust implements the Rust language analyzer for semantic input tracing |
| typescript | Package typescript implements the TypeScript language analyzer for semantic input tracing |
| | Package batch provides batch analysis capabilities for analyzing multiple code snippets |
| | Package callgraph provides sophisticated call graph management with distance computation for input flow analysis. |
| | Package classifier provides snippet classification using carrier maps |
| | Package condition provides key condition extraction for branch analysis. |
| | Package discovery - carrier map builder and serialization |
| | Package extractor provides utilities to extract traceable PHP expressions from code snippets |
| | Package index provides a unified code indexer with signature-based lookup, inspired by ATLANTIS's multi-tier code retrieval approach. |
| | Package pathanalysis provides inter-procedural path expansion and pruning for taint analysis. |
| | Package symbolic provides symbolic execution for deep semantic tracing. This traces object instantiation, constructor execution, method calls, and property population. Works universally across all PHP applications, with no framework-specific hints. |
| | Package tracer provides variable tracing across codebases |
| | Package types defines universal data structures for semantic input tracing across all supported programming languages. |