analyzer

package
v1.16.0

Warning: This package is not in the latest version of its module.
Published: Apr 14, 2026 License: MIT Imports: 24 Imported by: 0

Documentation

Index

Constants

View Source
const (
	LabelFunctionBody = "func_body"
	LabelClassBody    = "class_body"
	LabelUnreachable  = "unreachable"
	LabelMainModule   = "main"
	LabelEntry        = "ENTRY"
	LabelExit         = "EXIT"

	// Loop-related labels
	LabelLoopHeader = "loop_header"
	LabelLoopBody   = "loop_body"
	LabelLoopExit   = "loop_exit"
	LabelLoopElse   = "loop_else"

	// Exception-related labels
	LabelTryBlock     = "try_block"
	LabelExceptBlock  = "except_block"
	LabelFinallyBlock = "finally_block"
	LabelTryElse      = "try_else"

	// Advanced construct labels
	LabelWithSetup    = "with_setup"
	LabelWithBody     = "with_body"
	LabelWithTeardown = "with_teardown"
	LabelMatchCase    = "match_case"
	LabelMatchMerge   = "match_merge"
)

Block label constants to avoid magic strings

Variables

This section is empty.

Functions

func CalculateCouplingMetrics

func CalculateCouplingMetrics(graph *DependencyGraph, options *CouplingMetricsOptions) error

CalculateCouplingMetrics is a convenience function for calculating coupling metrics on a dependency graph with the given options

func CalculateDIAntipatterns added in v1.16.0

func CalculateDIAntipatterns(ast *parser.Node, filePath string) ([]domain.DIAntipatternFinding, error)

CalculateDIAntipatterns is a convenience function for detecting DI anti-patterns with default options

func CalculateDIAntipatternsWithConfig added in v1.16.0

func CalculateDIAntipatternsWithConfig(ast *parser.Node, filePath string, options *DIAntipatternOptions) ([]domain.DIAntipatternFinding, error)

CalculateDIAntipatternsWithConfig detects DI anti-patterns with custom configuration

func CalculateFilesCBO

func CalculateFilesCBO(asts map[string]*parser.Node, options *CBOOptions) (map[string][]*CBOResult, error)

CalculateFilesCBO calculates CBO for multiple files

func ComputeKeyRoots

func ComputeKeyRoots(root *TreeNode) []int

ComputeKeyRoots identifies key roots for path decomposition

func ComputeLeftMostLeaves

func ComputeLeftMostLeaves(root *TreeNode)

ComputeLeftMostLeaves computes left-most leaf descendants for all nodes

func ExtractAttributeName added in v1.16.0

func ExtractAttributeName(attrNode *parser.Node) string

ExtractAttributeName extracts the attribute/method name from an Attribute node. This is a shared helper used by concrete dependency and service locator detectors.

func FindClassMethods added in v1.16.0

func FindClassMethods(classNode *parser.Node) []*parser.Node

FindClassMethods finds all methods (functions) defined in a class body. This is a shared helper for DI anti-pattern detectors.

func FindInitMethod added in v1.16.0

func FindInitMethod(classNode *parser.Node) *parser.Node

FindInitMethod finds the __init__ method in a class body. Returns nil if no __init__ method is found. This is a shared helper for DI anti-pattern detectors.

func GenerateSummary added in v1.16.0

func GenerateSummary(findings []domain.DIAntipatternFinding, filesAnalyzed int) domain.DIAntipatternSummary

GenerateSummary generates summary statistics from findings

func GetCycleBreakingSuggestions

func GetCycleBreakingSuggestions(result *CircularDependencyResult) []string

GetCycleBreakingSuggestions suggests module refactoring to break cycles

func GroupFindingsByReason

func GroupFindingsByReason(findings []*DeadCodeFinding) map[DeadCodeReason][]*DeadCodeFinding

GroupFindingsByReason groups findings by their reason

func HasCircularDependencies

func HasCircularDependencies(graph *DependencyGraph) bool

HasCircularDependencies quickly checks if a graph has any circular dependencies

func IsBoilerplateLabel added in v1.9.2

func IsBoilerplateLabel(label string) bool

IsBoilerplateLabel checks if a tree node label represents boilerplate code. Boilerplate includes type annotations, decorators, and type hint related nodes. This is the single source of truth for boilerplate detection, used by both the cost model and any other components that need to identify boilerplate.

func PopulateLogicalLines added in v1.15.0

func PopulateLogicalLines(result *RawMetricsResult, ast *parser.Node)

PopulateLogicalLines updates raw metrics with LLOC derived from the parsed AST.

func PostOrderTraversal

func PostOrderTraversal(root *TreeNode)

PostOrderTraversal performs post-order traversal and assigns post-order IDs

func PrepareTreeForAPTED

func PrepareTreeForAPTED(root *TreeNode) []int

PrepareTreeForAPTED prepares a tree for APTED algorithm by computing all necessary indices

func SortFindings added in v1.16.0

func SortFindings(findings []domain.DIAntipatternFinding, sortBy domain.SortCriteria) []domain.DIAntipatternFinding

SortFindings sorts findings by the specified criteria

Types

type APTEDAnalyzer

type APTEDAnalyzer struct {
	// contains filtered or unexported fields
}

APTEDAnalyzer implements APTED (All Path Tree Edit Distance), based on Pawlik & Augsten's optimal O(n² log n) algorithm

func NewAPTEDAnalyzer

func NewAPTEDAnalyzer(costModel CostModel) *APTEDAnalyzer

NewAPTEDAnalyzer creates a new APTED analyzer with the given cost model

func (*APTEDAnalyzer) BatchComputeDistances

func (a *APTEDAnalyzer) BatchComputeDistances(pairs [][2]*TreeNode) []float64

BatchComputeDistances computes distances between multiple tree pairs efficiently

func (*APTEDAnalyzer) ClusterSimilarTrees

func (a *APTEDAnalyzer) ClusterSimilarTrees(trees []*TreeNode, similarityThreshold float64) *ClusterResult

ClusterSimilarTrees clusters trees based on similarity threshold

func (*APTEDAnalyzer) ComputeDetailedDistance

func (a *APTEDAnalyzer) ComputeDetailedDistance(tree1, tree2 *TreeNode) *TreeEditResult

ComputeDetailedDistance computes detailed tree edit distance information

func (*APTEDAnalyzer) ComputeDistance

func (a *APTEDAnalyzer) ComputeDistance(tree1, tree2 *TreeNode) float64

ComputeDistance computes the tree edit distance between two trees

func (*APTEDAnalyzer) ComputeSimilarity

func (a *APTEDAnalyzer) ComputeSimilarity(tree1, tree2 *TreeNode) float64

ComputeSimilarity computes similarity score between two trees (0.0 to 1.0)

type ASTFeatureExtractor

type ASTFeatureExtractor struct {
	// contains filtered or unexported fields
}

ASTFeatureExtractor implements FeatureExtractor for TreeNode

func NewASTFeatureExtractor

func NewASTFeatureExtractor() *ASTFeatureExtractor

NewASTFeatureExtractor creates a feature extractor with sensible defaults

func (*ASTFeatureExtractor) ExtractFeatures

func (a *ASTFeatureExtractor) ExtractFeatures(ast *TreeNode) ([]string, error)

ExtractFeatures builds a mixed set of features from the tree:

  • Subtree hashes (bottom-up) up to maxSubtreeHeight
  • k-grams from pre-order traversal labels
  • Node type presence and lightweight distribution markers
  • Structural pattern tokens

func (*ASTFeatureExtractor) ExtractNodeSequences

func (a *ASTFeatureExtractor) ExtractNodeSequences(ast *TreeNode, k int) []string

ExtractNodeSequences returns k-grams from pre-order traversal labels

func (*ASTFeatureExtractor) ExtractSubtreeHashes

func (a *ASTFeatureExtractor) ExtractSubtreeHashes(ast *TreeNode, maxHeight int) []string

ExtractSubtreeHashes computes bottom-up hashes of subtrees up to maxHeight

func (*ASTFeatureExtractor) WithOptions

func (a *ASTFeatureExtractor) WithOptions(maxHeight, k int, includeTypes, includeLiterals bool) *ASTFeatureExtractor

WithOptions allows overriding defaults

type AggregateComplexity

type AggregateComplexity struct {
	TotalFunctions    int
	AverageComplexity float64
	MaxComplexity     int
	MinComplexity     int
	HighRiskCount     int
	MediumRiskCount   int
	LowRiskCount      int
}

AggregateComplexity holds aggregate complexity metrics for multiple functions

func CalculateAggregateComplexity

func CalculateAggregateComplexity(results []*ComplexityResult) *AggregateComplexity

CalculateAggregateComplexity computes aggregate complexity metrics

type AggregateRawMetrics added in v1.15.0

type AggregateRawMetrics struct {
	FilesAnalyzed  int
	SLOC           int
	LLOC           int
	CommentLines   int
	DocstringLines int
	BlankLines     int
	TotalLines     int
	CommentRatio   float64
}

AggregateRawMetrics contains aggregated raw code metrics across files.

func CalculateAggregateRawMetrics added in v1.15.0

func CalculateAggregateRawMetrics(results []*RawMetricsResult) *AggregateRawMetrics

CalculateAggregateRawMetrics aggregates raw code metrics across files.

type BasicBlock

type BasicBlock struct {
	// ID is the unique identifier for this block
	ID string

	// Statements contains the AST nodes in this block
	Statements []*parser.Node

	// Predecessors are blocks that can flow into this block
	Predecessors []*Edge

	// Successors are blocks that this block can flow to
	Successors []*Edge

	// Label is an optional human-readable label
	Label string

	// IsEntry indicates if this is an entry block
	IsEntry bool

	// IsExit indicates if this is an exit block
	IsExit bool
}

BasicBlock represents a basic block in the control flow graph

func NewBasicBlock

func NewBasicBlock(id string) *BasicBlock

NewBasicBlock creates a new basic block with the given ID

func (*BasicBlock) AddStatement

func (bb *BasicBlock) AddStatement(stmt *parser.Node)

AddStatement adds an AST node to this block

func (*BasicBlock) AddSuccessor

func (bb *BasicBlock) AddSuccessor(to *BasicBlock, edgeType EdgeType) *Edge

AddSuccessor adds an outgoing edge to another block

func (*BasicBlock) IsEmpty

func (bb *BasicBlock) IsEmpty() bool

IsEmpty returns true if the block has no statements

func (*BasicBlock) RemoveSuccessor

func (bb *BasicBlock) RemoveSuccessor(to *BasicBlock)

RemoveSuccessor removes an edge to the specified block

func (*BasicBlock) String

func (bb *BasicBlock) String() string

String returns a string representation of the basic block

type CBOAnalyzer

type CBOAnalyzer struct {
	// contains filtered or unexported fields
}

CBOAnalyzer analyzes class coupling in Python code

func NewCBOAnalyzer

func NewCBOAnalyzer(options *CBOOptions) *CBOAnalyzer

NewCBOAnalyzer creates a new CBO analyzer

func (*CBOAnalyzer) AnalyzeClasses

func (a *CBOAnalyzer) AnalyzeClasses(ast *parser.Node, filePath string) ([]*CBOResult, error)

AnalyzeClasses analyzes CBO for all classes in the given AST

type CBOOptions

type CBOOptions struct {
	IncludeBuiltins   bool
	IncludeImports    bool
	PublicClassesOnly bool
	ExcludePatterns   []string
	LowThreshold      int // Default: 3 (industry standard)
	MediumThreshold   int // Default: 7 (industry standard)
}

CBOOptions configures CBO analysis behavior

func DefaultCBOOptions

func DefaultCBOOptions() *CBOOptions

DefaultCBOOptions returns default CBO analysis options. Threshold values are sourced from domain/defaults.go.

type CBOResult

type CBOResult struct {
	// Core CBO metric
	CouplingCount int

	// Class information
	ClassName string
	FilePath  string
	StartLine int
	EndLine   int

	// Dependency breakdown
	InheritanceDependencies     int
	TypeHintDependencies        int
	InstantiationDependencies   int
	AttributeAccessDependencies int
	ImportDependencies          int

	// Detailed dependency list
	DependentClasses []string

	// Risk assessment
	RiskLevel string // "low", "medium", "high"

	// Additional class metadata
	IsAbstract  bool
	BaseClasses []string
	Methods     []string
	Attributes  []string
}

CBOResult holds CBO (Coupling Between Objects) metrics for a class

func CalculateCBO

func CalculateCBO(ast *parser.Node, filePath string) ([]*CBOResult, error)

CalculateCBO is a convenience function for calculating CBO with default config

func CalculateCBOWithConfig

func CalculateCBOWithConfig(ast *parser.Node, filePath string, options *CBOOptions) ([]*CBOResult, error)

CalculateCBOWithConfig calculates CBO with custom configuration

type CFG

type CFG struct {
	// Entry is the entry point of the graph
	Entry *BasicBlock

	// Exit is the exit point of the graph
	Exit *BasicBlock

	// Blocks contains all blocks in the graph, indexed by ID
	Blocks map[string]*BasicBlock

	// Name is the name of the CFG (e.g., function name)
	Name string

	// FunctionNode is the original AST node for the function
	FunctionNode *parser.Node
	// contains filtered or unexported fields
}

CFG represents a control flow graph

func NewCFG

func NewCFG(name string) *CFG

NewCFG creates a new control flow graph

func (*CFG) AddBlock

func (cfg *CFG) AddBlock(block *BasicBlock)

AddBlock adds an existing block to the graph

func (*CFG) BreadthFirstWalk

func (cfg *CFG) BreadthFirstWalk(visitor CFGVisitor)

BreadthFirstWalk performs a breadth-first traversal of the CFG

func (*CFG) ConnectBlocks

func (cfg *CFG) ConnectBlocks(from, to *BasicBlock, edgeType EdgeType) *Edge

ConnectBlocks creates an edge between two blocks

func (*CFG) CreateBlock

func (cfg *CFG) CreateBlock(label string) *BasicBlock

CreateBlock creates a new basic block and adds it to the graph

func (*CFG) GetBlock

func (cfg *CFG) GetBlock(id string) *BasicBlock

GetBlock retrieves a block by its ID

func (*CFG) RemoveBlock

func (cfg *CFG) RemoveBlock(block *BasicBlock)

RemoveBlock removes a block from the graph

func (*CFG) Size

func (cfg *CFG) Size() int

Size returns the number of blocks in the graph

func (*CFG) String

func (cfg *CFG) String() string

String returns a string representation of the CFG

func (*CFG) Walk

func (cfg *CFG) Walk(visitor CFGVisitor)

Walk performs a depth-first traversal of the CFG

type CFGBuilder

type CFGBuilder struct {
	// contains filtered or unexported fields
}

CFGBuilder builds control flow graphs from AST nodes

func NewCFGBuilder

func NewCFGBuilder() *CFGBuilder

NewCFGBuilder creates a new CFG builder

func (*CFGBuilder) Build

func (b *CFGBuilder) Build(node *parser.Node) (*CFG, error)

Build constructs a CFG from an AST node

func (*CFGBuilder) BuildAll

func (b *CFGBuilder) BuildAll(node *parser.Node) (map[string]*CFG, error)

BuildAll builds CFGs for all functions in the AST

func (*CFGBuilder) SetLogger

func (b *CFGBuilder) SetLogger(logger *log.Logger)

SetLogger sets an optional logger for error reporting

type CFGFeatures added in v1.5.0

type CFGFeatures struct {
	BlockCount       int              // Number of basic blocks
	EdgeCount        int              // Number of edges
	EdgeTypeCounts   map[EdgeType]int // Distribution of edge types
	CyclomaticNumber int              // Cyclomatic complexity: V(G) = E - N + 2P
	BranchingFactor  float64          // Average number of successors per block
	LoopEdgeCount    int              // Number of loop back-edges
	ConditionalCount int              // Number of conditional branches
}

CFGFeatures captures key structural properties of a control flow graph

type CFGVisitor

type CFGVisitor interface {
	// VisitBlock is called for each basic block
	// Returns false to stop traversal
	VisitBlock(block *BasicBlock) bool

	// VisitEdge is called for each edge
	// Returns false to stop traversal
	VisitEdge(edge *Edge) bool
}

CFGVisitor defines the interface for visiting CFG nodes

type CentroidGrouping

type CentroidGrouping struct {
	// contains filtered or unexported fields
}

CentroidGrouping implements centroid-based grouping that avoids transitive problems. This strategy uses BFS to grow groups while directly comparing candidates to existing members, avoiding the transitive similarity issue (A↔B↔C where A and C are dissimilar).

func NewCentroidGrouping

func NewCentroidGrouping(threshold float64) *CentroidGrouping

NewCentroidGrouping creates a new centroid-based grouping strategy

func (*CentroidGrouping) GetName

func (c *CentroidGrouping) GetName() string

func (*CentroidGrouping) GroupClones

func (c *CentroidGrouping) GroupClones(pairs []*ClonePair) []*CloneGroup

GroupClones groups clones using centroid-based approach

func (*CentroidGrouping) SetThresholds

func (c *CentroidGrouping) SetThresholds(type1, type2, type3, type4 float64)

SetThresholds sets the clone type thresholds for classification

type CircularDependency

type CircularDependency struct {
	Modules      []string          // Modules involved in the cycle
	Dependencies []DependencyChain // The dependency chains that form the cycle
	Severity     CycleSeverity     // Severity level of this cycle
	Size         int               // Number of modules in the cycle
	Description  string            // Human-readable description
}

CircularDependency represents a circular dependency relationship

func FindSimpleCycles

func FindSimpleCycles(graph *DependencyGraph) []*CircularDependency

FindSimpleCycles finds all simple cycles (2-module cycles) in the graph

type CircularDependencyDetector

type CircularDependencyDetector struct {
	// contains filtered or unexported fields
}

CircularDependencyDetector detects circular dependencies using Tarjan's algorithm

func NewCircularDependencyDetector

func NewCircularDependencyDetector(graph *DependencyGraph) *CircularDependencyDetector

NewCircularDependencyDetector creates a new circular dependency detector

func (*CircularDependencyDetector) DetectCircularDependencies

func (cdd *CircularDependencyDetector) DetectCircularDependencies() *CircularDependencyResult

DetectCircularDependencies detects all circular dependencies in the graph

type CircularDependencyResult

type CircularDependencyResult struct {
	HasCircularDependencies bool                  // True if any cycles were found
	TotalCycles             int                   // Total number of cycles detected
	TotalModulesInCycles    int                   // Total number of modules involved in cycles
	CircularDependencies    []*CircularDependency // All detected circular dependencies

	// Severity breakdown
	LowSeverityCycles      int // Number of low severity cycles
	MediumSeverityCycles   int // Number of medium severity cycles
	HighSeverityCycles     int // Number of high severity cycles
	CriticalSeverityCycles int // Number of critical severity cycles

	// Most problematic cycles
	LargestCycle       *CircularDependency // Cycle with most modules
	MostComplexCycle   *CircularDependency // Cycle with most dependency chains
	CoreInfrastructure []string            // Modules that appear in multiple cycles
}

CircularDependencyResult contains the results of circular dependency analysis

func DetectCircularDependencies

func DetectCircularDependencies(graph *DependencyGraph) *CircularDependencyResult

DetectCircularDependencies is a convenience function for detecting cycles in a graph

type ClassificationResult added in v1.5.0

type ClassificationResult struct {
	CloneType  CloneType
	Similarity float64
	Confidence float64
	Analyzer   string
}

ClassificationResult holds the result of clone classification

type CloneClassifier added in v1.5.0

type CloneClassifier struct {
	// contains filtered or unexported fields
}

CloneClassifier orchestrates multi-dimensional clone classification. It uses different analyzers for each clone type and applies a cascading classification approach from fastest (Type-1) to slowest (Type-4).

func NewCloneClassifier added in v1.5.0

func NewCloneClassifier(config *CloneClassifierConfig) *CloneClassifier

NewCloneClassifier creates a new multi-dimensional clone classifier

func (*CloneClassifier) ClassifyClone added in v1.5.0

func (c *CloneClassifier) ClassifyClone(f1, f2 *CodeFragment) *ClassificationResult

ClassifyClone determines the clone type using cascading analysis. It returns the clone type, similarity score, and confidence. Classification order: Type-1 (fastest) -> Type-2 -> Type-3 -> Type-4 (slowest)

func (*CloneClassifier) ClassifyCloneSimple added in v1.5.0

func (c *CloneClassifier) ClassifyCloneSimple(f1, f2 *CodeFragment) (CloneType, float64, float64)

ClassifyCloneSimple is a simplified version that returns just CloneType, similarity, and confidence. This is for backward compatibility with existing code.

func (*CloneClassifier) SetSemanticAnalyzer added in v1.5.0

func (c *CloneClassifier) SetSemanticAnalyzer(analyzer SimilarityAnalyzer)

SetSemanticAnalyzer sets the semantic similarity analyzer (for testing)

func (*CloneClassifier) SetStructuralAnalyzer added in v1.5.0

func (c *CloneClassifier) SetStructuralAnalyzer(analyzer SimilarityAnalyzer)

SetStructuralAnalyzer sets the structural similarity analyzer (for testing)

func (*CloneClassifier) SetSyntacticAnalyzer added in v1.5.0

func (c *CloneClassifier) SetSyntacticAnalyzer(analyzer SimilarityAnalyzer)

SetSyntacticAnalyzer sets the syntactic similarity analyzer (for testing)

func (*CloneClassifier) SetTextualAnalyzer added in v1.5.0

func (c *CloneClassifier) SetTextualAnalyzer(analyzer SimilarityAnalyzer)

SetTextualAnalyzer sets the textual similarity analyzer (for testing)

type CloneClassifierConfig added in v1.5.0

type CloneClassifierConfig struct {
	Type1Threshold         float64
	Type2Threshold         float64
	Type3Threshold         float64
	Type4Threshold         float64
	EnableTextualAnalysis  bool
	EnableSemanticAnalysis bool
	EnableDFAAnalysis      bool // Enable Data Flow Analysis for enhanced Type-4 detection
}

CloneClassifierConfig holds configuration for the clone classifier

type CloneDetector

type CloneDetector struct {
	// contains filtered or unexported fields
}

CloneDetector detects code clones using APTED algorithm

func NewCloneDetector

func NewCloneDetector(config *CloneDetectorConfig) *CloneDetector

NewCloneDetector creates a new clone detector with the given configuration

func (*CloneDetector) DetectClones

func (cd *CloneDetector) DetectClones(fragments []*CodeFragment) ([]*ClonePair, []*CloneGroup)

DetectClones detects clones in the given code fragments

func (*CloneDetector) DetectClonesWithContext

func (cd *CloneDetector) DetectClonesWithContext(ctx context.Context, fragments []*CodeFragment) ([]*ClonePair, []*CloneGroup)

DetectClonesWithContext detects clones with context support for cancellation

func (*CloneDetector) DetectClonesWithLSH

func (cd *CloneDetector) DetectClonesWithLSH(ctx context.Context, fragments []*CodeFragment) ([]*ClonePair, []*CloneGroup)

DetectClonesWithLSH runs a two-stage pipeline using LSH for candidate generation, followed by APTED verification on candidates only. Falls back to exhaustive pairwise comparison if misconfigured.

func (*CloneDetector) ExtractFragments

func (cd *CloneDetector) ExtractFragments(astNodes []*parser.Node, filePath string) []*CodeFragment

ExtractFragments extracts code fragments from AST nodes

func (*CloneDetector) ExtractFragmentsWithSource added in v1.5.0

func (cd *CloneDetector) ExtractFragmentsWithSource(astNodes []*parser.Node, filePath string, sourceCode []byte) []*CodeFragment

ExtractFragmentsWithSource extracts code fragments from AST nodes with source content. Use this method when EnableTextualAnalysis is true to populate CodeFragment.Content.

func (*CloneDetector) GetStatistics

func (cd *CloneDetector) GetStatistics() map[string]interface{}

GetStatistics returns clone detection statistics

func (*CloneDetector) SetBatchSizeLarge

func (cd *CloneDetector) SetBatchSizeLarge(size int)

SetBatchSizeLarge sets the batch size for normal projects (used in testing)

func (*CloneDetector) SetUseLSH

func (cd *CloneDetector) SetUseLSH(enabled bool)

SetUseLSH enables or disables LSH acceleration for clone detection

type CloneDetectorConfig

type CloneDetectorConfig struct {
	// Minimum number of lines for a code fragment to be considered
	MinLines int

	// Minimum number of AST nodes for a code fragment
	MinNodes int

	// Similarity thresholds for different clone types
	Type1Threshold float64 // Usually > domain.DefaultType1CloneThreshold
	Type2Threshold float64 // Usually > domain.DefaultType2CloneThreshold
	Type3Threshold float64 // Usually > domain.DefaultType3CloneThreshold
	Type4Threshold float64 // Usually > domain.DefaultType4CloneThreshold

	// Minimum similarity threshold for clone reporting (user-configurable via --clone-threshold)
	SimilarityThreshold float64

	// Maximum edit distance allowed
	MaxEditDistance float64

	// Whether to ignore differences in literals
	IgnoreLiterals bool

	// Whether to ignore differences in identifiers
	IgnoreIdentifiers bool

	// Whether to skip docstrings from AST comparison (default: true)
	// Docstrings are the first Expr(Constant(str)) in function/class/module bodies
	SkipDocstrings bool

	// Cost model to use for APTED
	CostModelType string // "default", "python", "weighted"

	// Performance tuning parameters
	MaxClonePairs      int // Maximum pairs to keep in memory
	BatchSizeThreshold int // Minimum fragments to trigger batching
	BatchSizeLarge     int // Batch size for normal projects
	BatchSizeSmall     int // Batch size for large projects
	LargeProjectSize   int // Fragment count threshold for large projects

	// Grouping configuration
	GroupingMode      GroupingMode // Default: GroupingModeConnected
	GroupingThreshold float64      // Default: Type3Threshold
	KCoreK            int          // Default: 2

	// LSH Configuration (optional, opt-in)
	UseLSH                 bool    // Enable LSH acceleration
	LSHSimilarityThreshold float64 // Candidate threshold using MinHash similarity
	LSHBands               int     // Number of LSH bands (default: 32)
	LSHRows                int     // Rows per band (default: 4)
	LSHMinHashCount        int     // Number of MinHash functions (default: 128)

	// Multi-dimensional classification (optional, opt-in)
	EnableMultiDimensionalAnalysis bool // Enable multi-dimensional clone type classification
	EnableTextualAnalysis          bool // Enable Type-1 textual analysis (increases memory usage)
	EnableSemanticAnalysis         bool // Enable Type-4 semantic/CFG analysis (increases CPU usage)
	EnableDFAAnalysis              bool // Enable Data Flow Analysis for enhanced Type-4 detection

	// Framework pattern handling (reduces false positives for dataclass, Pydantic, etc.)
	ReduceBoilerplateSimilarity bool    // Apply lower weight to boilerplate nodes (default: true)
	BoilerplateMultiplier       float64 // Cost multiplier for boilerplate nodes (default: 0.1)
}

CloneDetectorConfig holds configuration for clone detection

func DefaultCloneDetectorConfig

func DefaultCloneDetectorConfig() *CloneDetectorConfig

DefaultCloneDetectorConfig returns default configuration

type CloneGroup

type CloneGroup struct {
	ID         int             // Unique identifier for this group
	Fragments  []*CodeFragment // All fragments in this group
	CloneType  CloneType       // Primary type of clones in this group
	Similarity float64         // Average similarity within the group
	Size       int             // Number of fragments
}

CloneGroup represents a group of similar code fragments

func NewCloneGroup

func NewCloneGroup(id int) *CloneGroup

NewCloneGroup creates a new clone group

func (*CloneGroup) AddFragment

func (cg *CloneGroup) AddFragment(fragment *CodeFragment)

AddFragment adds a fragment to the clone group

type ClonePair

type ClonePair struct {
	Fragment1  *CodeFragment
	Fragment2  *CodeFragment
	Similarity float64   // Similarity score (0.0 to 1.0)
	Distance   float64   // Edit distance
	CloneType  CloneType // Type of clone detected
	Confidence float64   // Confidence in the detection (0.0 to 1.0)
}

ClonePair represents a pair of similar code fragments

func (*ClonePair) String

func (cp *ClonePair) String() string

String returns string representation of ClonePair

type CloneType

type CloneType int

CloneType represents different types of code clones

const (
	// Type1Clone - Identical code fragments (except whitespace and comments)
	Type1Clone CloneType = iota + 1
	// Type2Clone - Syntactically identical but with different identifiers/literals
	Type2Clone
	// Type3Clone - Syntactically similar with small modifications
	Type3Clone
	// Type4Clone - Functionally similar but syntactically different
	Type4Clone
)

func (CloneType) String

func (ct CloneType) String() string

String returns string representation of CloneType

type ClusterResult

type ClusterResult struct {
	Groups    [][]int     // Groups of tree indices that are similar
	Distances [][]float64 // Distance matrix between all trees
	Threshold float64     // Similarity threshold used
}

ClusterResult represents the result of tree clustering

type CodeFragment

type CodeFragment struct {
	Location   *CodeLocation
	ASTNode    *parser.Node
	TreeNode   *TreeNode
	Content    string   // Original source code content
	Hash       string   // Hash for quick comparison
	Size       int      // Number of AST nodes
	LineCount  int      // Number of source lines
	Complexity int      // Cyclomatic complexity (if applicable)
	Features   []string // Pre-computed features for Jaccard similarity
}

CodeFragment represents a fragment of code

func NewCodeFragment

func NewCodeFragment(location *CodeLocation, astNode *parser.Node, content string) *CodeFragment

NewCodeFragment creates a new code fragment

type CodeLocation

type CodeLocation struct {
	FilePath  string
	StartLine int
	EndLine   int
	StartCol  int
	EndCol    int
}

CodeLocation represents a location in source code

func (*CodeLocation) String

func (cl *CodeLocation) String() string

String returns string representation of CodeLocation

type CognitiveComplexityResult added in v1.14.0

type CognitiveComplexityResult struct {
	// Total cognitive complexity score
	Total int

	// Function/method information
	FunctionName string
	StartLine    int
	EndLine      int
}

CognitiveComplexityResult holds the cognitive complexity score for a function

func CalculateCognitiveComplexity added in v1.14.0

func CalculateCognitiveComplexity(funcNode *parser.Node) *CognitiveComplexityResult

CalculateCognitiveComplexity computes the cognitive complexity for a function node following the SonarSource specification.

Rules:

  • +1 (base increment) for: if, elif, else, for, while, except, break, continue, goto, ternary (IfExp), and each boolean operator sequence change
  • +nesting level (nesting increment) for: if, ternary (IfExp), for, while, except, match/case, nested functions/lambdas (structures that increase nesting)
  • Nesting level increases inside: if, elif, else, for, while, except, with, match/case, lambda, nested function/class definitions

type CompleteLinkageGrouping

type CompleteLinkageGrouping struct {
	// contains filtered or unexported fields
}

CompleteLinkageGrouping ensures all pairs in a group meet the threshold

func NewCompleteLinkageGrouping

func NewCompleteLinkageGrouping(threshold float64) *CompleteLinkageGrouping

func (*CompleteLinkageGrouping) GetName

func (c *CompleteLinkageGrouping) GetName() string

func (*CompleteLinkageGrouping) GroupClones

func (c *CompleteLinkageGrouping) GroupClones(pairs []*ClonePair) []*CloneGroup

type ComplexityAnalyzer

type ComplexityAnalyzer struct {
	// contains filtered or unexported fields
}

ComplexityAnalyzer provides high-level complexity analysis functionality

func NewComplexityAnalyzer

func NewComplexityAnalyzer(cfg *config.Config, output io.Writer) (*ComplexityAnalyzer, error)

NewComplexityAnalyzer creates a new complexity analyzer with configuration

func NewComplexityAnalyzerWithDefaults

func NewComplexityAnalyzerWithDefaults(output io.Writer) (*ComplexityAnalyzer, error)

NewComplexityAnalyzerWithDefaults creates a new analyzer with default configuration

func (*ComplexityAnalyzer) AnalyzeAndReport

func (ca *ComplexityAnalyzer) AnalyzeAndReport(cfgs []*CFG) error

AnalyzeAndReport performs complexity analysis and generates a formatted report

func (*ComplexityAnalyzer) AnalyzeFunction

func (ca *ComplexityAnalyzer) AnalyzeFunction(cfg *CFG) *ComplexityResult

AnalyzeFunction analyzes a single function and returns the result

func (*ComplexityAnalyzer) AnalyzeFunctions

func (ca *ComplexityAnalyzer) AnalyzeFunctions(cfgs []*CFG) []*ComplexityResult

AnalyzeFunctions analyzes multiple functions and returns filtered results

func (*ComplexityAnalyzer) CheckComplexityLimits

func (ca *ComplexityAnalyzer) CheckComplexityLimits(cfgs []*CFG) (bool, []*ComplexityResult)

CheckComplexityLimits checks whether any function exceeds the configured maximum complexity. Returns true if all functions are within limits, false otherwise

func (*ComplexityAnalyzer) GenerateReport

func (ca *ComplexityAnalyzer) GenerateReport(cfgs []*CFG) *reporter.ComplexityReport

GenerateReport creates a comprehensive report without outputting it

func (*ComplexityAnalyzer) GetConfiguration

func (ca *ComplexityAnalyzer) GetConfiguration() *config.Config

GetConfiguration returns the current configuration

func (*ComplexityAnalyzer) SetOutput

func (ca *ComplexityAnalyzer) SetOutput(output io.Writer) error

SetOutput changes the output destination for reports

func (*ComplexityAnalyzer) UpdateConfiguration

func (ca *ComplexityAnalyzer) UpdateConfiguration(cfg *config.Config) error

UpdateConfiguration updates the analyzer configuration

type ComplexityResult

type ComplexityResult struct {
	// McCabe cyclomatic complexity
	Complexity int

	// Raw CFG metrics
	Edges               int
	Nodes               int
	ConnectedComponents int

	// Function/method information
	FunctionName string
	StartLine    int
	StartCol     int
	EndLine      int

	// Nesting depth
	NestingDepth int

	// Decision points breakdown
	IfStatements      int
	LoopStatements    int
	ExceptionHandlers int
	SwitchCases       int

	// Risk assessment based on complexity thresholds
	RiskLevel string // "low", "medium", "high"
}

ComplexityResult holds cyclomatic complexity metrics for a function or method

func CalculateComplexity

func CalculateComplexity(cfg *CFG) *ComplexityResult

CalculateComplexity computes McCabe cyclomatic complexity for a CFG using default thresholds

func CalculateComplexityWithConfig

func CalculateComplexityWithConfig(cfg *CFG, complexityConfig *config.ComplexityConfig) *ComplexityResult

CalculateComplexityWithConfig computes McCabe cyclomatic complexity using provided configuration

func CalculateFileComplexity

func CalculateFileComplexity(cfgs []*CFG) []*ComplexityResult

CalculateFileComplexity calculates complexity for all functions in a collection of CFGs

func CalculateFileComplexityWithConfig

func CalculateFileComplexityWithConfig(cfgs []*CFG, complexityConfig *config.ComplexityConfig) []*ComplexityResult

CalculateFileComplexityWithConfig calculates complexity using provided configuration

func (*ComplexityResult) GetComplexity

func (cr *ComplexityResult) GetComplexity() int

func (*ComplexityResult) GetDetailedMetrics

func (cr *ComplexityResult) GetDetailedMetrics() map[string]int

func (*ComplexityResult) GetFunctionName

func (cr *ComplexityResult) GetFunctionName() string

func (*ComplexityResult) GetRiskLevel

func (cr *ComplexityResult) GetRiskLevel() string

func (*ComplexityResult) String

func (cr *ComplexityResult) String() string

String returns a human-readable representation of the complexity result

type ConcreteDependencyDetector added in v1.16.0

type ConcreteDependencyDetector struct {
	// contains filtered or unexported fields
}

ConcreteDependencyDetector detects concrete dependency anti-patterns

func NewConcreteDependencyDetector added in v1.16.0

func NewConcreteDependencyDetector() *ConcreteDependencyDetector

NewConcreteDependencyDetector creates a new concrete dependency detector

func (*ConcreteDependencyDetector) Analyze added in v1.16.0

func (d *ConcreteDependencyDetector) Analyze(ast *parser.Node, filePath string) []domain.DIAntipatternFinding

Analyze detects concrete dependencies in the given AST

type ConnectedGrouping

type ConnectedGrouping struct {
	// contains filtered or unexported fields
}

ConnectedGrouping wraps the existing transitive grouping logic using Union-Find

func NewConnectedGrouping

func NewConnectedGrouping(threshold float64) *ConnectedGrouping

func (*ConnectedGrouping) GetName

func (c *ConnectedGrouping) GetName() string

func (*ConnectedGrouping) GroupClones

func (c *ConnectedGrouping) GroupClones(pairs []*ClonePair) []*CloneGroup

type ConstructorAnalyzer added in v1.16.0

type ConstructorAnalyzer struct {
	// contains filtered or unexported fields
}

ConstructorAnalyzer detects constructor over-injection anti-pattern

func NewConstructorAnalyzer added in v1.16.0

func NewConstructorAnalyzer(threshold int) *ConstructorAnalyzer

NewConstructorAnalyzer creates a new constructor analyzer

func (*ConstructorAnalyzer) Analyze added in v1.16.0

func (a *ConstructorAnalyzer) Analyze(ast *parser.Node, filePath string) []domain.DIAntipatternFinding

Analyze detects constructor over-injection in the given AST

type CostModel

type CostModel interface {
	// Insert returns the cost of inserting a node
	Insert(node *TreeNode) float64

	// Delete returns the cost of deleting a node
	Delete(node *TreeNode) float64

	// Rename returns the cost of renaming node1 to node2
	Rename(node1, node2 *TreeNode) float64
}

CostModel defines the interface for calculating edit operation costs

type CouplingMetricsCalculator

type CouplingMetricsCalculator struct {
	// contains filtered or unexported fields
}

CouplingMetricsCalculator calculates various coupling and quality metrics for modules

func NewCouplingMetricsCalculator

func NewCouplingMetricsCalculator(graph *DependencyGraph, options *CouplingMetricsOptions) *CouplingMetricsCalculator

NewCouplingMetricsCalculator creates a new coupling metrics calculator

func (*CouplingMetricsCalculator) CalculateMetrics

func (calc *CouplingMetricsCalculator) CalculateMetrics() error

CalculateMetrics calculates all metrics for the dependency graph

type CouplingMetricsOptions

type CouplingMetricsOptions struct {
	IncludeAbstractness bool               // Calculate abstractness metrics
	ComplexityData      map[string]int     // Complexity data from complexity analysis
	ClonesData          map[string]float64 // Clone data from clone analysis
	DeadCodeData        map[string]int     // Dead code data from dead code analysis
}

CouplingMetricsOptions configures metrics calculation

func DefaultCouplingMetricsOptions

func DefaultCouplingMetricsOptions() *CouplingMetricsOptions

DefaultCouplingMetricsOptions returns default options

type CycleSeverity

type CycleSeverity string

CycleSeverity represents the severity level of a circular dependency

const (
	CycleSeverityLow      CycleSeverity = "low"      // Simple 2-module cycles
	CycleSeverityMedium   CycleSeverity = "medium"   // 3-5 module cycles
	CycleSeverityHigh     CycleSeverity = "high"     // 6-10 module cycles
	CycleSeverityCritical CycleSeverity = "critical" // 10+ module cycles or core infrastructure
)

type DFABuilder added in v1.5.0

type DFABuilder struct {
	// contains filtered or unexported fields
}

DFABuilder constructs def-use chain information from a CFG

func NewDFABuilder added in v1.5.0

func NewDFABuilder() *DFABuilder

NewDFABuilder creates a new DFA builder

func (*DFABuilder) Build added in v1.5.0

func (b *DFABuilder) Build(cfg *CFG) (*DFAInfo, error)

Build creates DFA information for the given CFG

type DFAFeatures added in v1.5.0

type DFAFeatures struct {
	TotalDefs       int // Total number of definitions
	TotalUses       int // Total number of uses
	TotalPairs      int // Total number of def-use pairs
	UniqueVariables int // Number of unique variables

	AvgChainLength  float64 // Average uses per definition
	MaxChainLength  int     // Maximum def-use chain length
	CrossBlockPairs int     // Def-use pairs spanning blocks
	IntraBlockPairs int     // Def-use pairs within same block

	DefKindCounts map[DefUseKind]int // Distribution of definition kinds
	UseKindCounts map[DefUseKind]int // Distribution of use kinds
}

DFAFeatures captures data flow characteristics for clone comparison

func ExtractDFAFeatures added in v1.5.0

func ExtractDFAFeatures(info *DFAInfo) *DFAFeatures

ExtractDFAFeatures extracts DFA features from DFAInfo

func NewDFAFeatures added in v1.5.0

func NewDFAFeatures() *DFAFeatures

NewDFAFeatures creates a new DFA features instance

type DFAInfo added in v1.5.0

type DFAInfo struct {
	CFG       *CFG
	Chains    map[string]*DefUseChain    // Variable name -> chain
	BlockDefs map[string][]*VarReference // Block ID -> definitions in block
	BlockUses map[string][]*VarReference // Block ID -> uses in block
}

DFAInfo holds complete data flow information for a CFG

func NewDFAInfo added in v1.5.0

func NewDFAInfo(cfg *CFG) *DFAInfo

NewDFAInfo creates a new DFA info for a CFG

func (*DFAInfo) AddDef added in v1.5.0

func (info *DFAInfo) AddDef(ref *VarReference)

AddDef adds a definition to the DFA info

func (*DFAInfo) AddUse added in v1.5.0

func (info *DFAInfo) AddUse(ref *VarReference)

AddUse adds a use to the DFA info

func (*DFAInfo) GetChain added in v1.5.0

func (info *DFAInfo) GetChain(variable string) *DefUseChain

GetChain returns the def-use chain for a variable, creating one if needed

func (*DFAInfo) TotalDefs added in v1.5.0

func (info *DFAInfo) TotalDefs() int

TotalDefs returns the total number of definitions

func (*DFAInfo) TotalPairs added in v1.5.0

func (info *DFAInfo) TotalPairs() int

TotalPairs returns the total number of def-use pairs

func (*DFAInfo) TotalUses added in v1.5.0

func (info *DFAInfo) TotalUses() int

TotalUses returns the total number of uses

func (*DFAInfo) UniqueVariables added in v1.5.0

func (info *DFAInfo) UniqueVariables() int

UniqueVariables returns the number of unique variables

type DIAntipatternDetector added in v1.16.0

type DIAntipatternDetector struct {
	// contains filtered or unexported fields
}

DIAntipatternDetector coordinates all DI anti-pattern detectors

func NewDIAntipatternDetector added in v1.16.0

func NewDIAntipatternDetector(options *DIAntipatternOptions) *DIAntipatternDetector

NewDIAntipatternDetector creates a new DI anti-pattern detector

func (*DIAntipatternDetector) Analyze added in v1.16.0

func (d *DIAntipatternDetector) Analyze(ast *parser.Node, filePath string) ([]domain.DIAntipatternFinding, error)

Analyze runs all DI anti-pattern detectors on the given AST

type DIAntipatternOptions added in v1.16.0

type DIAntipatternOptions struct {
	ConstructorParamThreshold int
	MinSeverity               domain.DIAntipatternSeverity
}

DIAntipatternOptions configures DI anti-pattern detection

func DefaultDIAntipatternOptions added in v1.16.0

func DefaultDIAntipatternOptions() *DIAntipatternOptions

DefaultDIAntipatternOptions returns default options

type DeadCodeDetector

type DeadCodeDetector struct {
	// contains filtered or unexported fields
}

DeadCodeDetector provides high-level dead code detection functionality

func NewDeadCodeDetector

func NewDeadCodeDetector(cfg *CFG) *DeadCodeDetector

NewDeadCodeDetector creates a new dead code detector for the given CFG

func NewDeadCodeDetectorWithFilePath

func NewDeadCodeDetectorWithFilePath(cfg *CFG, filePath string) *DeadCodeDetector

NewDeadCodeDetectorWithFilePath creates a new dead code detector with file path context

func (*DeadCodeDetector) Detect

func (dcd *DeadCodeDetector) Detect() *DeadCodeResult

Detect performs dead code detection and returns structured findings

func (*DeadCodeDetector) GetDeadCodeRatio

func (dcd *DeadCodeDetector) GetDeadCodeRatio() float64

GetDeadCodeRatio returns the ratio of dead blocks to total blocks

func (*DeadCodeDetector) HasDeadCode

func (dcd *DeadCodeDetector) HasDeadCode() bool

HasDeadCode checks if the CFG contains any dead code

type DeadCodeFinding

type DeadCodeFinding struct {
	// Function information
	FunctionName string `json:"function_name"`
	FilePath     string `json:"file_path"`

	// Location information
	StartLine int `json:"start_line"`
	EndLine   int `json:"end_line"`

	// Dead code details
	BlockID     string         `json:"block_id"`
	Code        string         `json:"code"`
	Reason      DeadCodeReason `json:"reason"`
	Severity    SeverityLevel  `json:"severity"`
	Description string         `json:"description"`

	// Context information
	Context []string `json:"context,omitempty"`
}

DeadCodeFinding represents a single dead code detection result

func FilterFindingsBySeverity

func FilterFindingsBySeverity(findings []*DeadCodeFinding, minSeverity SeverityLevel) []*DeadCodeFinding

FilterFindingsBySeverity filters findings by minimum severity level

type DeadCodeReason

type DeadCodeReason string

DeadCodeReason represents the reason why code is considered dead

const (
	// ReasonUnreachableAfterReturn indicates code after a return statement
	ReasonUnreachableAfterReturn DeadCodeReason = "unreachable_after_return"

	// ReasonUnreachableAfterBreak indicates code after a break statement
	ReasonUnreachableAfterBreak DeadCodeReason = "unreachable_after_break"

	// ReasonUnreachableAfterContinue indicates code after a continue statement
	ReasonUnreachableAfterContinue DeadCodeReason = "unreachable_after_continue"

	// ReasonUnreachableAfterRaise indicates code after a raise statement
	ReasonUnreachableAfterRaise DeadCodeReason = "unreachable_after_raise"

	// ReasonUnreachableBranch indicates an unreachable branch condition
	ReasonUnreachableBranch DeadCodeReason = "unreachable_branch"

	// ReasonUnreachableAfterInfiniteLoop indicates code after an infinite loop
	ReasonUnreachableAfterInfiniteLoop DeadCodeReason = "unreachable_after_infinite_loop"
)

type DeadCodeResult

type DeadCodeResult struct {
	// Function information
	FunctionName string `json:"function_name"`
	FilePath     string `json:"file_path"`

	// Analysis results
	Findings       []*DeadCodeFinding `json:"findings"`
	TotalBlocks    int                `json:"total_blocks"`
	DeadBlocks     int                `json:"dead_blocks"`
	ReachableRatio float64            `json:"reachable_ratio"`

	// Performance metrics
	AnalysisTime time.Duration `json:"analysis_time"`
}

DeadCodeResult contains the results of dead code analysis for a single CFG

func DetectInFile

func DetectInFile(cfgs map[string]*CFG, filePath string) []*DeadCodeResult

DetectInFile analyzes multiple CFGs from a file and returns combined findings

func DetectInFunction

func DetectInFunction(cfg *CFG) *DeadCodeResult

DetectInFunction analyzes a single CFG and returns findings

func DetectInFunctionWithFilePath

func DetectInFunctionWithFilePath(cfg *CFG, filePath string) *DeadCodeResult

DetectInFunctionWithFilePath analyzes a single CFG with file path context

type DefUseChain added in v1.5.0

type DefUseChain struct {
	Variable string
	Defs     []*VarReference
	Uses     []*VarReference
	Pairs    []*DefUsePair
}

DefUseChain represents all def-use relationships for a variable

func NewDefUseChain added in v1.5.0

func NewDefUseChain(variable string) *DefUseChain

NewDefUseChain creates a new def-use chain for a variable

func (*DefUseChain) AddDef added in v1.5.0

func (c *DefUseChain) AddDef(ref *VarReference)

AddDef adds a definition to the chain

func (*DefUseChain) AddPair added in v1.5.0

func (c *DefUseChain) AddPair(pair *DefUsePair)

AddPair adds a def-use pair to the chain

func (*DefUseChain) AddUse added in v1.5.0

func (c *DefUseChain) AddUse(ref *VarReference)

AddUse adds a use to the chain

type DefUseKind added in v1.5.0

type DefUseKind int

DefUseKind classifies how a variable is referenced

const (
	// Definition kinds
	DefKindAssign       DefUseKind = iota // x = ...
	DefKindParameter                      // def f(x):
	DefKindForTarget                      // for x in ...
	DefKindImport                         // import x / from m import x
	DefKindWithTarget                     // with ... as x:
	DefKindExceptTarget                   // except E as x:
	DefKindAugmented                      // x += 1 (both def and use)

	// Use kinds
	UseKindRead      // ... = x (reading variable)
	UseKindCall      // f(x) (as argument)
	UseKindAttribute // x.attr (base object)
	UseKindSubscript // x[i] (base object)
)

func (DefUseKind) IsDef added in v1.5.0

func (k DefUseKind) IsDef() bool

IsDef returns true if this kind represents a definition

func (DefUseKind) IsUse added in v1.5.0

func (k DefUseKind) IsUse() bool

IsUse returns true if this kind represents a use

func (DefUseKind) String added in v1.5.0

func (k DefUseKind) String() string

String returns the string representation of DefUseKind

type DefUsePair added in v1.5.0

type DefUsePair struct {
	Def *VarReference
	Use *VarReference
}

DefUsePair links a definition to its use

func NewDefUsePair added in v1.5.0

func NewDefUsePair(def, use *VarReference) *DefUsePair

NewDefUsePair creates a new def-use pair

func (*DefUsePair) IsCrossBlock added in v1.5.0

func (p *DefUsePair) IsCrossBlock() bool

IsCrossBlock returns true if the def and use are in different blocks

type DefaultCostModel

type DefaultCostModel struct{}

DefaultCostModel implements a uniform cost model where all operations cost 1.0

func NewDefaultCostModel

func NewDefaultCostModel() *DefaultCostModel

NewDefaultCostModel creates a new default cost model

func (*DefaultCostModel) Delete

func (c *DefaultCostModel) Delete(node *TreeNode) float64

Delete returns the cost of deleting a node (always 1.0)

func (*DefaultCostModel) Insert

func (c *DefaultCostModel) Insert(node *TreeNode) float64

Insert returns the cost of inserting a node (always 1.0)

func (*DefaultCostModel) Rename

func (c *DefaultCostModel) Rename(node1, node2 *TreeNode) float64

Rename returns the cost of renaming node1 to node2

type DependencyChain

type DependencyChain struct {
	From   string   // Starting module
	To     string   // Ending module
	Path   []string // Complete dependency path
	Length int      // Length of the chain
}

DependencyChain represents a chain of dependencies

type DependencyEdge

type DependencyEdge struct {
	From       string             // Source module name
	To         string             // Target module name
	EdgeType   DependencyEdgeType // Type of dependency
	ImportInfo *ImportInfo        // Details about the import
}

DependencyEdge represents a dependency relationship between modules

type DependencyEdgeType

type DependencyEdgeType string

DependencyEdgeType represents the type of dependency relationship

const (
	DependencyEdgeImport     DependencyEdgeType = "import"      // Direct import (import module)
	DependencyEdgeFromImport DependencyEdgeType = "from_import" // From import (from module import name)
	DependencyEdgeRelative   DependencyEdgeType = "relative"    // Relative import
	DependencyEdgeImplicit   DependencyEdgeType = "implicit"    // Implicit dependency
)

type DependencyGraph

type DependencyGraph struct {
	// Graph structure
	Nodes map[string]*ModuleNode // Module name -> ModuleNode
	Edges []*DependencyEdge      // All dependency relationships

	// Graph metadata
	TotalModules int      // Total number of modules
	TotalEdges   int      // Total number of dependencies
	RootModules  []string // Modules with no dependencies
	LeafModules  []string // Modules with no dependents
	ProjectRoot  string   // Project root directory

	// Analysis results
	CyclicGroups  [][]string                // Strongly connected components (cycles)
	ModuleMetrics map[string]*ModuleMetrics // Module-level metrics
	SystemMetrics *SystemMetrics            // System-wide metrics
}

DependencyGraph represents the complete module dependency graph

func NewDependencyGraph

func NewDependencyGraph(projectRoot string) *DependencyGraph

NewDependencyGraph creates a new dependency graph

func (*DependencyGraph) AddDependency

func (g *DependencyGraph) AddDependency(from, to string, edgeType DependencyEdgeType, importInfo *ImportInfo)

AddDependency adds a dependency edge between two modules

func (*DependencyGraph) AddModule

func (g *DependencyGraph) AddModule(moduleName, filePath string) *ModuleNode

AddModule adds a module to the graph

func (*DependencyGraph) Clone

func (g *DependencyGraph) Clone() *DependencyGraph

Clone creates a deep copy of the dependency graph

func (*DependencyGraph) GetDependencies

func (g *DependencyGraph) GetDependencies(moduleName string) []string

GetDependencies returns all modules that the given module depends on

func (*DependencyGraph) GetDependencyChain

func (g *DependencyGraph) GetDependencyChain(from, to string) []string

GetDependencyChain finds the dependency path between two modules

func (*DependencyGraph) GetDependents

func (g *DependencyGraph) GetDependents(moduleName string) []string

GetDependents returns all modules that depend on the given module

func (*DependencyGraph) GetLeafModules

func (g *DependencyGraph) GetLeafModules() []string

GetLeafModules returns modules with no dependents (utilities)

func (*DependencyGraph) GetModule

func (g *DependencyGraph) GetModule(moduleName string) *ModuleNode

GetModule retrieves a module node by name

func (*DependencyGraph) GetModuleNames

func (g *DependencyGraph) GetModuleNames() []string

GetModuleNames returns all module names in the graph

func (*DependencyGraph) GetModulesInCycles

func (g *DependencyGraph) GetModulesInCycles() []string

GetModulesInCycles returns all modules that are part of dependency cycles

func (*DependencyGraph) GetPackages

func (g *DependencyGraph) GetPackages() []string

GetPackages returns all unique package names

func (*DependencyGraph) GetRootModules

func (g *DependencyGraph) GetRootModules() []string

GetRootModules returns modules with no dependencies (entry points)

func (*DependencyGraph) HasCycle

func (g *DependencyGraph) HasCycle() bool

HasCycle checks if the graph contains any cycles

func (*DependencyGraph) String

func (g *DependencyGraph) String() string

String returns a string representation of the graph

func (*DependencyGraph) Validate

func (g *DependencyGraph) Validate() error

Validate checks the graph for consistency

type Edge

type Edge struct {
	From *BasicBlock
	To   *BasicBlock
	Type EdgeType
}

Edge represents a directed edge between two basic blocks

type EdgeType

type EdgeType int

EdgeType represents the type of edge between basic blocks

const (
	// EdgeNormal represents normal sequential flow
	EdgeNormal EdgeType = iota
	// EdgeCondTrue represents conditional true branch
	EdgeCondTrue
	// EdgeCondFalse represents conditional false branch
	EdgeCondFalse
	// EdgeException represents exception flow
	EdgeException
	// EdgeLoop represents loop back edge
	EdgeLoop
	// EdgeBreak represents break statement flow
	EdgeBreak
	// EdgeContinue represents continue statement flow
	EdgeContinue
	// EdgeReturn represents return statement flow
	EdgeReturn
)

func (EdgeType) String

func (e EdgeType) String() string

String returns string representation of EdgeType

type FeatureExtractor

type FeatureExtractor interface {
	ExtractFeatures(ast *TreeNode) ([]string, error)
	ExtractSubtreeHashes(ast *TreeNode, maxHeight int) []string
	ExtractNodeSequences(ast *TreeNode, k int) []string
}

FeatureExtractor converts AST trees into feature sets for Jaccard similarity

type FileComplexityAnalyzer

type FileComplexityAnalyzer struct {
	// contains filtered or unexported fields
}

FileComplexityAnalyzer provides high-level file analysis capabilities

func NewFileComplexityAnalyzer

func NewFileComplexityAnalyzer(cfg *config.Config, output io.Writer) (*FileComplexityAnalyzer, error)

NewFileComplexityAnalyzer creates a new file analyzer with configuration

func (*FileComplexityAnalyzer) AnalyzeFile

func (fca *FileComplexityAnalyzer) AnalyzeFile(filename string) error

AnalyzeFile analyzes a single Python file and outputs complexity results

func (*FileComplexityAnalyzer) AnalyzeFiles

func (fca *FileComplexityAnalyzer) AnalyzeFiles(filenames []string) error

AnalyzeFiles analyzes multiple Python files and outputs combined complexity results

type GroupingConfig

type GroupingConfig struct {
	Mode           GroupingMode
	Threshold      float64 // Minimum similarity for group membership
	KCoreK         int     // K value for k-core mode (default: 2)
	Type1Threshold float64 // Type-1 clone threshold
	Type2Threshold float64 // Type-2 clone threshold
	Type3Threshold float64 // Type-3 clone threshold
	Type4Threshold float64 // Type-4 clone threshold
}

GroupingConfig holds configuration for grouping strategies

type GroupingMode

type GroupingMode string

GroupingMode represents the strategy for grouping clones

const (
	GroupingModeConnected       GroupingMode = "connected"        // current default (high recall)
	GroupingModeStar            GroupingMode = "star"             // star/medoid (balanced)
	GroupingModeCompleteLinkage GroupingMode = "complete_linkage" // complete linkage (high precision)
	GroupingModeKCore           GroupingMode = "k_core"           // k-core constraint (scalable)
	GroupingModeCentroid        GroupingMode = "centroid"         // centroid-based (avoids transitive chaining)
)

type GroupingStrategy

type GroupingStrategy interface {
	// GroupClones groups the given clone pairs into clone groups.
	GroupClones(pairs []*ClonePair) []*CloneGroup
	// GetName returns the strategy name.
	GetName() string
}

GroupingStrategy defines a strategy for grouping clone pairs into clone groups. Implementations should avoid recomputing similarities and work with provided pairs.

func CreateGroupingStrategy

func CreateGroupingStrategy(config GroupingConfig) GroupingStrategy

CreateGroupingStrategy creates a strategy based on mode and config

type HashFunc

type HashFunc func(uint64) uint64

HashFunc maps a 64-bit base hash to another 64-bit value

type HiddenDependencyDetector added in v1.16.0

type HiddenDependencyDetector struct {
	// contains filtered or unexported fields
}

HiddenDependencyDetector detects hidden dependency anti-patterns

func NewHiddenDependencyDetector added in v1.16.0

func NewHiddenDependencyDetector() *HiddenDependencyDetector

NewHiddenDependencyDetector creates a new hidden dependency detector

func (*HiddenDependencyDetector) Analyze added in v1.16.0

func (d *HiddenDependencyDetector) Analyze(ast *parser.Node, filePath string) []domain.DIAntipatternFinding

Analyze detects hidden dependencies in the given AST

type ImportInfo

type ImportInfo struct {
	Statement      string   // Original import statement
	ImportedNames  []string // Names imported (for from imports)
	Alias          string   // Alias used (if any)
	IsRelative     bool     // True for relative imports
	Level          int      // Level for relative imports (number of dots)
	Line           int      // Line number where import occurs
	IsTypeChecking bool     // True if import is inside a TYPE_CHECKING block
}

ImportInfo contains details about an import statement

type KCoreGrouping

type KCoreGrouping struct {
	// contains filtered or unexported fields
}

KCoreGrouping ensures each fragment has at least k similar neighbors

func NewKCoreGrouping

func NewKCoreGrouping(threshold float64, k int) *KCoreGrouping

func (*KCoreGrouping) GetName

func (k *KCoreGrouping) GetName() string

func (*KCoreGrouping) GroupClones

func (k *KCoreGrouping) GroupClones(pairs []*ClonePair) []*CloneGroup

type LCOMAnalyzer added in v1.11.0

type LCOMAnalyzer struct {
	// contains filtered or unexported fields
}

LCOMAnalyzer analyzes class cohesion in Python code

func NewLCOMAnalyzer added in v1.11.0

func NewLCOMAnalyzer(options *LCOMOptions) *LCOMAnalyzer

NewLCOMAnalyzer creates a new LCOM analyzer

func (*LCOMAnalyzer) AnalyzeClasses added in v1.11.0

func (a *LCOMAnalyzer) AnalyzeClasses(ast *parser.Node, filePath string) ([]*LCOMResult, error)

AnalyzeClasses analyzes LCOM4 for all classes in the given AST

type LCOMOptions added in v1.11.0

type LCOMOptions struct {
	LowThreshold    int // Default: 2 (LCOM4 <= 2 is low risk)
	MediumThreshold int // Default: 5 (LCOM4 3-5 is medium risk)
	ExcludePatterns []string
}

LCOMOptions configures LCOM analysis behavior

func DefaultLCOMOptions added in v1.11.0

func DefaultLCOMOptions() *LCOMOptions

DefaultLCOMOptions returns default LCOM analysis options

type LCOMResult added in v1.11.0

type LCOMResult struct {
	// Core LCOM4 metric - number of connected components
	LCOM4 int

	// Class information
	ClassName string
	FilePath  string
	StartLine int
	EndLine   int

	// Method statistics
	TotalMethods    int // All methods found in class
	ExcludedMethods int // @staticmethod and @classmethod excluded

	// Instance variable count
	InstanceVariables int // Distinct self.xxx variables

	// Connected component details
	MethodGroups [][]string // Method names grouped by connected component

	// Risk assessment
	RiskLevel string // "low", "medium", "high"
}

LCOMResult holds LCOM4 (Lack of Cohesion of Methods) metrics for a class

func CalculateLCOM added in v1.11.0

func CalculateLCOM(ast *parser.Node, filePath string) ([]*LCOMResult, error)

CalculateLCOM is a convenience function that creates an analyzer with defaults and runs it

func CalculateLCOMWithConfig added in v1.11.0

func CalculateLCOMWithConfig(ast *parser.Node, filePath string, options *LCOMOptions) ([]*LCOMResult, error)

CalculateLCOMWithConfig creates an analyzer with custom options and runs it

type LSHIndex

type LSHIndex struct {
	// contains filtered or unexported fields
}

LSHIndex implements MinHash LSH with banding

func NewLSHIndex

func NewLSHIndex(bands, rows int) *LSHIndex

NewLSHIndex creates an index with banding parameters

func (*LSHIndex) AddFragment

func (idx *LSHIndex) AddFragment(id string, signature *MinHashSignature) error

AddFragment inserts a fragment signature into the index

func (*LSHIndex) BuildIndex

func (idx *LSHIndex) BuildIndex() error

BuildIndex is a no-op for incremental building (kept for API symmetry)

func (*LSHIndex) FindCandidates

func (idx *LSHIndex) FindCandidates(signature *MinHashSignature) []string

FindCandidates retrieves candidate fragment IDs that share at least one band bucket

type MinHashSignature

type MinHashSignature struct {
	// contains filtered or unexported fields
}

MinHashSignature holds the signature vector

type MinHasher

type MinHasher struct {
	// contains filtered or unexported fields
}

MinHasher computes MinHash signatures for feature sets

func NewMinHasher

func NewMinHasher(numHashes int) *MinHasher

NewMinHasher creates a MinHasher with numHashes functions (default 128 if invalid)

func (*MinHasher) ComputeSignature

func (m *MinHasher) ComputeSignature(features []string) *MinHashSignature

ComputeSignature computes the MinHash signature for a set of features

func (*MinHasher) EstimateJaccardSimilarity

func (m *MinHasher) EstimateJaccardSimilarity(sig1, sig2 *MinHashSignature) float64

EstimateJaccardSimilarity estimates Jaccard similarity via signature agreement ratio

func (*MinHasher) NumHashes

func (m *MinHasher) NumHashes() int

type ModuleAnalysisOptions

type ModuleAnalysisOptions struct {
	ProjectRoot       string   // Project root directory
	PythonPath        []string // Additional Python path entries
	ExcludePatterns   []string // Module patterns to exclude
	IncludePatterns   []string // Module patterns to include (default: ["**/*.py"])
	IncludeStdLib     bool     // Include standard library dependencies
	IncludeThirdParty bool     // Include third-party dependencies
	FollowRelative    bool     // Follow relative imports
}

ModuleAnalysisOptions configures module analysis behavior

func DefaultModuleAnalysisOptions

func DefaultModuleAnalysisOptions() *ModuleAnalysisOptions

DefaultModuleAnalysisOptions returns default analysis options

type ModuleAnalyzer

type ModuleAnalyzer struct {
	// contains filtered or unexported fields
}

ModuleAnalyzer analyzes module-level dependencies and builds dependency graphs

func NewModuleAnalyzer

func NewModuleAnalyzer(options *ModuleAnalysisOptions) (*ModuleAnalyzer, error)

NewModuleAnalyzer creates a new module analyzer

func (*ModuleAnalyzer) AnalyzeFiles

func (ma *ModuleAnalyzer) AnalyzeFiles(filePaths []string) (*DependencyGraph, error)

AnalyzeFiles analyzes specific Python files and builds a dependency graph

func (*ModuleAnalyzer) AnalyzeProject

func (ma *ModuleAnalyzer) AnalyzeProject() (*DependencyGraph, error)

AnalyzeProject analyzes all Python modules in the project and builds a dependency graph

type ModuleMetrics

type ModuleMetrics struct {
	// Coupling metrics
	AfferentCoupling int     // Ca - Number of modules that depend on this module
	EfferentCoupling int     // Ce - Number of modules this module depends on
	Instability      float64 // I = Ce / (Ca + Ce) - Measure of instability
	Abstractness     float64 // A - Measure of abstractness (0-1)
	Distance         float64 // D - Distance from main sequence

	// Size metrics
	LinesOfCode     int // Total lines of code
	PublicInterface int // Number of public functions/classes

	// Quality metrics
	CyclomaticComplexity int // Average complexity of functions
}

ModuleMetrics contains metrics for a single module
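
The coupling fields follow Robert Martin's package metrics. A sketch of the instability and main-sequence-distance formulas as documented above (D = |A + I - 1| is the standard formulation; the package may guard edge cases differently):

```go
package main

import (
	"fmt"
	"math"
)

// instability computes I = Ce / (Ca + Ce): 0 is maximally stable,
// 1 is maximally unstable. Modules with no couplings are treated as stable.
func instability(ca, ce int) float64 {
	if ca+ce == 0 {
		return 0
	}
	return float64(ce) / float64(ca+ce)
}

// distance computes D = |A + I - 1|, the distance from the "main sequence"
// where abstractness and instability balance out.
func distance(abstractness, instability float64) float64 {
	return math.Abs(abstractness + instability - 1)
}

func main() {
	i := instability(3, 1) // 3 dependents, 1 dependency: a stable module
	fmt.Printf("I=%.2f D=%.2f\n", i, distance(0.75, i))
}
```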

type ModuleNode

type ModuleNode struct {
	// Module identification
	Name         string // Module name (e.g., "mypackage.submodule")
	FilePath     string // Absolute path to the Python file
	RelativePath string // Relative path from project root
	Package      string // Package name (e.g., "mypackage")
	IsPackage    bool   // True if this represents a package (__init__.py)

	// Dependencies
	Imports      []string        // Direct imports from this module
	ImportedBy   []string        // Modules that import this module
	Dependencies map[string]bool // Set of modules this module depends on
	Dependents   map[string]bool // Set of modules that depend on this module

	// Metrics
	InDegree  int // Number of incoming dependencies
	OutDegree int // Number of outgoing dependencies

	// Module information
	LineCount     int      // Total lines in the module
	FunctionCount int      // Number of functions defined
	ClassCount    int      // Number of classes defined
	PublicNames   []string // Public names exported by this module
}

ModuleNode represents a module in the dependency graph

type NestingDepthResult added in v1.0.2

type NestingDepthResult struct {
	// Maximum nesting depth found in the function
	MaxDepth int

	// Function/method information
	FunctionName string
	StartLine    int
	EndLine      int

	// Location of deepest nesting (line number)
	DeepestNestingLine int
}

NestingDepthResult holds the maximum nesting depth and related metadata for a function

func CalculateMaxNestingDepth added in v1.0.2

func CalculateMaxNestingDepth(funcNode *parser.Node) *NestingDepthResult

CalculateMaxNestingDepth computes the maximum nesting depth for a function node by traversing its AST and tracking depth through nested control structures
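
A toy sketch of that traversal: depth increases at each nested control structure, and the maximum over all paths is reported (node kinds here are illustrative, not the parser's actual labels):

```go
package main

import "fmt"

// node is a toy stand-in for parser.Node: a kind plus children.
type node struct {
	kind     string
	children []*node
}

// nesting marks the kinds that increase depth, mirroring control structures.
var nesting = map[string]bool{"if": true, "for": true, "while": true, "with": true, "try": true}

// maxDepth walks the tree, incrementing depth at each nesting construct
// and keeping the maximum seen on any path.
func maxDepth(n *node, depth int) int {
	if nesting[n.kind] {
		depth++
	}
	max := depth
	for _, c := range n.children {
		if d := maxDepth(c, depth); d > max {
			max = d
		}
	}
	return max
}

func main() {
	// def f(): if ...: for ...: if ...: pass  -> depth 3
	tree := &node{kind: "func", children: []*node{
		{kind: "if", children: []*node{
			{kind: "for", children: []*node{
				{kind: "if"},
			}},
		}},
	}}
	fmt.Println(maxDepth(tree, 0))
}
```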

type OptimizedAPTEDAnalyzer

type OptimizedAPTEDAnalyzer struct {
	*APTEDAnalyzer
	// contains filtered or unexported fields
}

OptimizedAPTEDAnalyzer extends APTEDAnalyzer with performance optimizations

func NewOptimizedAPTEDAnalyzer

func NewOptimizedAPTEDAnalyzer(costModel CostModel, maxDistance float64) *OptimizedAPTEDAnalyzer

NewOptimizedAPTEDAnalyzer creates an optimized APTED analyzer

func (*OptimizedAPTEDAnalyzer) ComputeDistance

func (a *OptimizedAPTEDAnalyzer) ComputeDistance(tree1, tree2 *TreeNode) float64

ComputeDistance computes tree edit distance with early stopping optimization

type PythonCostModel

type PythonCostModel struct {
	// Base costs for different operations
	BaseInsertCost float64
	BaseDeleteCost float64
	BaseRenameCost float64

	// Whether to ignore differences in literal values
	IgnoreLiterals bool

	// Whether to ignore differences in identifier names
	IgnoreIdentifiers bool

	// Whether to reduce weight for boilerplate nodes (type annotations, decorators, Field() calls)
	ReduceBoilerplateWeight bool

	// Multiplier for boilerplate nodes (default: 0.1)
	BoilerplateMultiplier float64
}

PythonCostModel implements a Python-aware cost model with different costs for different node types

func NewPythonCostModel

func NewPythonCostModel() *PythonCostModel

NewPythonCostModel creates a new Python-aware cost model with default settings

func NewPythonCostModelWithBoilerplateConfig added in v1.9.2

func NewPythonCostModelWithBoilerplateConfig(ignoreLiterals, ignoreIdentifiers, reduceBoilerplate bool, boilerplateMultiplier float64) *PythonCostModel

NewPythonCostModelWithBoilerplateConfig creates a Python cost model with full configuration

func NewPythonCostModelWithConfig

func NewPythonCostModelWithConfig(ignoreLiterals, ignoreIdentifiers bool) *PythonCostModel

NewPythonCostModelWithConfig creates a Python cost model with custom configuration

func (*PythonCostModel) Delete

func (c *PythonCostModel) Delete(node *TreeNode) float64

Delete returns the cost of deleting a node

func (*PythonCostModel) Insert

func (c *PythonCostModel) Insert(node *TreeNode) float64

Insert returns the cost of inserting a node

func (*PythonCostModel) Rename

func (c *PythonCostModel) Rename(node1, node2 *TreeNode) float64

Rename returns the cost of renaming node1 to node2

type RawMetricsResult added in v1.15.0

type RawMetricsResult struct {
	FilePath       string
	SLOC           int
	LLOC           int
	CommentLines   int
	DocstringLines int
	BlankLines     int
	TotalLines     int
	CommentRatio   float64
	// contains filtered or unexported fields
}

RawMetricsResult contains file-level raw code metrics.

func CalculateRawMetrics added in v1.15.0

func CalculateRawMetrics(content []byte, filePath string) *RawMetricsResult

CalculateRawMetrics calculates raw code metrics without requiring AST parsing.
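
A simplified sketch of line classification without AST parsing (the real implementation also tracks docstring lines and logical lines of code, which need more context than this sketch keeps):

```go
package main

import (
	"fmt"
	"strings"
)

// rawMetrics splits content into lines and counts source lines,
// comment lines, and blank lines by inspecting each trimmed line.
func rawMetrics(content string) (sloc, comments, blanks int) {
	for _, line := range strings.Split(content, "\n") {
		t := strings.TrimSpace(line)
		switch {
		case t == "":
			blanks++
		case strings.HasPrefix(t, "#"):
			comments++
		default:
			sloc++
		}
	}
	return
}

func main() {
	src := "# helper\n\ndef add(a, b):\n    return a + b\n"
	s, c, b := rawMetrics(src)
	fmt.Println(s, c, b)
}
```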

type ReExportEntry added in v1.9.3

type ReExportEntry struct {
	Name         string // The exported name (e.g., "SomeClass")
	SourceModule string // The actual source module (e.g., "pkg_a.module_x")
	SourceName   string // The name in the source module (may differ if aliased)
}

ReExportEntry represents a single re-exported name from an __init__.py

type ReExportMap added in v1.9.3

type ReExportMap struct {
	PackageName string                    // The package name (e.g., "pkg_a")
	Exports     map[string]*ReExportEntry // name -> source info
	AllDeclared []string                  // Names in __all__ if declared
	HasAllDecl  bool                      // True if __all__ is explicitly declared
}

ReExportMap holds all exports from a package's __init__.py

type ReExportResolver added in v1.9.3

type ReExportResolver struct {
	// contains filtered or unexported fields
}

ReExportResolver resolves re-exports in __init__.py files

func NewReExportResolver added in v1.9.3

func NewReExportResolver(projectRoot string) *ReExportResolver

NewReExportResolver creates a new resolver

func (*ReExportResolver) GetReExportMap added in v1.9.3

func (r *ReExportResolver) GetReExportMap(packageName string) (*ReExportMap, error)

GetReExportMap returns the re-export map for a package (cached)

func (*ReExportResolver) ResolveReExport added in v1.9.3

func (r *ReExportResolver) ResolveReExport(packageName, importedName string) (string, bool)

ResolveReExport resolves an imported name to its actual source module. Returns (sourceModule, found).

Parse errors are treated as "no re-exports found" - if the __init__.py cannot be parsed, we fall back to using the package as the dependency target. This is intentional: syntax errors in __init__.py shouldn't break dependency analysis, and the error is cached to avoid repeated parse attempts.
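
That fallback can be pictured as a map lookup that degrades to the package name, as in this sketch (field names mirror ReExportEntry, but the logic is illustrative):

```go
package main

import "fmt"

// reExport maps an exported name in a package's __init__.py to its
// real source module, mirroring ReExportEntry.
type reExport struct {
	sourceModule string
	sourceName   string
}

// resolve looks an imported name up in the package's re-export map,
// falling back to the package itself when the name is not re-exported
// (or, in the real resolver, when __init__.py could not be parsed).
func resolve(exports map[string]reExport, pkg, name string) string {
	if e, ok := exports[name]; ok {
		return e.sourceModule
	}
	return pkg
}

func main() {
	exports := map[string]reExport{
		"SomeClass": {sourceModule: "pkg_a.module_x", sourceName: "SomeClass"},
	}
	fmt.Println(resolve(exports, "pkg_a", "SomeClass")) // resolved to source module
	fmt.Println(resolve(exports, "pkg_a", "Missing"))   // falls back to package
}
```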

type ReachabilityAnalyzer

type ReachabilityAnalyzer struct {
	// contains filtered or unexported fields
}

ReachabilityAnalyzer performs reachability analysis on CFGs

func NewReachabilityAnalyzer

func NewReachabilityAnalyzer(cfg *CFG) *ReachabilityAnalyzer

NewReachabilityAnalyzer creates a new reachability analyzer for the given CFG

func (*ReachabilityAnalyzer) AnalyzeReachability

func (ra *ReachabilityAnalyzer) AnalyzeReachability() *ReachabilityResult

AnalyzeReachability performs reachability analysis starting from the entry block

func (*ReachabilityAnalyzer) AnalyzeReachabilityFrom

func (ra *ReachabilityAnalyzer) AnalyzeReachabilityFrom(startBlock *BasicBlock) *ReachabilityResult

AnalyzeReachabilityFrom performs reachability analysis from a specific starting block

type ReachabilityResult

type ReachabilityResult struct {
	// ReachableBlocks contains blocks that can be reached from entry
	ReachableBlocks map[string]*BasicBlock

	// UnreachableBlocks contains blocks that cannot be reached from entry
	UnreachableBlocks map[string]*BasicBlock

	// TotalBlocks is the total number of blocks analyzed
	TotalBlocks int

	// ReachableCount is the number of reachable blocks
	ReachableCount int

	// UnreachableCount is the number of unreachable blocks
	UnreachableCount int

	// AnalysisTime is the time taken to perform the analysis
	AnalysisTime time.Duration
}

ReachabilityResult contains the results of reachability analysis

func (*ReachabilityResult) GetReachabilityRatio

func (result *ReachabilityResult) GetReachabilityRatio() float64

GetReachabilityRatio returns the ratio of reachable blocks to total blocks

func (*ReachabilityResult) GetUnreachableBlocksWithStatements

func (result *ReachabilityResult) GetUnreachableBlocksWithStatements() map[string]*BasicBlock

GetUnreachableBlocksWithStatements returns unreachable blocks that contain statements

func (*ReachabilityResult) HasUnreachableCode

func (result *ReachabilityResult) HasUnreachableCode() bool

HasUnreachableCode returns true if there are unreachable blocks with statements
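
Reachability over a CFG is a graph search from the entry block. A minimal breadth-first sketch over an adjacency list (the analyzer's actual block representation is richer):

```go
package main

import "fmt"

// reachable performs a breadth-first walk from the entry block over a
// block graph keyed by block ID, the core of CFG reachability analysis.
func reachable(succ map[string][]string, entry string) map[string]bool {
	seen := map[string]bool{entry: true}
	queue := []string{entry}
	for len(queue) > 0 {
		b := queue[0]
		queue = queue[1:]
		for _, s := range succ[b] {
			if !seen[s] {
				seen[s] = true
				queue = append(queue, s)
			}
		}
	}
	return seen
}

func main() {
	// ENTRY -> b1 -> EXIT; b2 has an edge to EXIT but nothing reaches it,
	// e.g. code after an unconditional return.
	cfg := map[string][]string{
		"ENTRY": {"b1"},
		"b1":    {"EXIT"},
		"b2":    {"EXIT"},
	}
	seen := reachable(cfg, "ENTRY")
	fmt.Println(seen["b1"], seen["b2"]) // b2 is unreachable dead code
}
```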

type SemanticSimilarityAnalyzer added in v1.5.0

type SemanticSimilarityAnalyzer struct {
	// contains filtered or unexported fields
}

SemanticSimilarityAnalyzer computes semantic similarity using CFG (Control Flow Graph) and optionally DFA (Data Flow Analysis) feature comparison. This is used for Type-4 clone detection (functionally similar code with different syntax).

func NewSemanticSimilarityAnalyzer added in v1.5.0

func NewSemanticSimilarityAnalyzer() *SemanticSimilarityAnalyzer

NewSemanticSimilarityAnalyzer creates a new semantic similarity analyzer

func NewSemanticSimilarityAnalyzerWithDFA added in v1.5.0

func NewSemanticSimilarityAnalyzerWithDFA() *SemanticSimilarityAnalyzer

NewSemanticSimilarityAnalyzerWithDFA creates a new analyzer with DFA enabled

func (*SemanticSimilarityAnalyzer) BuildCFG added in v1.5.0

func (s *SemanticSimilarityAnalyzer) BuildCFG(node *parser.Node) (*CFG, error)

BuildCFG builds a CFG from a parser.Node (exposed for testing)

func (*SemanticSimilarityAnalyzer) BuildDFA added in v1.5.0

func (s *SemanticSimilarityAnalyzer) BuildDFA(cfg *CFG) (*DFAInfo, error)

BuildDFA builds DFA info from a CFG (exposed for testing)

func (*SemanticSimilarityAnalyzer) ComputeSimilarity added in v1.5.0

func (s *SemanticSimilarityAnalyzer) ComputeSimilarity(f1, f2 *CodeFragment) float64

ComputeSimilarity computes the semantic similarity between two code fragments by comparing their CFG structures and optionally DFA features.

func (*SemanticSimilarityAnalyzer) ExtractDFAFeaturesFromInfo added in v1.5.0

func (s *SemanticSimilarityAnalyzer) ExtractDFAFeaturesFromInfo(info *DFAInfo) *DFAFeatures

ExtractDFAFeaturesFromInfo extracts DFA features from DFA info (exposed for testing)

func (*SemanticSimilarityAnalyzer) ExtractFeatures added in v1.5.0

func (s *SemanticSimilarityAnalyzer) ExtractFeatures(cfg *CFG) *CFGFeatures

ExtractFeatures extracts CFG features (exposed for testing)

func (*SemanticSimilarityAnalyzer) GetName added in v1.5.0

func (s *SemanticSimilarityAnalyzer) GetName() string

GetName returns the name of this analyzer

func (*SemanticSimilarityAnalyzer) IsDFAEnabled added in v1.5.0

func (s *SemanticSimilarityAnalyzer) IsDFAEnabled() bool

IsDFAEnabled returns whether DFA analysis is enabled

func (*SemanticSimilarityAnalyzer) SetEnableDFA added in v1.5.0

func (s *SemanticSimilarityAnalyzer) SetEnableDFA(enable bool)

SetEnableDFA enables or disables DFA analysis

func (*SemanticSimilarityAnalyzer) SetWeights added in v1.5.0

func (s *SemanticSimilarityAnalyzer) SetWeights(cfgWeight, dfaWeight float64)

SetWeights sets the CFG and DFA feature weights
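
A sketch of how weighted CFG and DFA scores might be blended into one similarity; the exact combination the analyzer uses is an assumption of this sketch:

```go
package main

import "fmt"

// combine blends CFG and DFA similarity scores by their weights,
// normalizing so the result stays in [0, 1]. Zero weights are guarded.
func combine(cfgSim, dfaSim, cfgW, dfaW float64) float64 {
	if cfgW+dfaW == 0 {
		return 0
	}
	return (cfgW*cfgSim + dfaW*dfaSim) / (cfgW + dfaW)
}

func main() {
	// High CFG similarity, moderate DFA similarity, CFG-dominated weights.
	fmt.Printf("%.2f\n", combine(0.9, 0.5, 0.7, 0.3))
}
```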

type ServiceLocatorDetector added in v1.16.0

type ServiceLocatorDetector struct {
	// contains filtered or unexported fields
}

ServiceLocatorDetector detects service locator anti-pattern

func NewServiceLocatorDetector added in v1.16.0

func NewServiceLocatorDetector() *ServiceLocatorDetector

NewServiceLocatorDetector creates a new service locator detector

func (*ServiceLocatorDetector) Analyze added in v1.16.0

func (d *ServiceLocatorDetector) Analyze(ast *parser.Node, filePath string) []domain.DIAntipatternFinding

Analyze detects service locator pattern in the given AST

type SeverityLevel

type SeverityLevel string

SeverityLevel represents the severity of a dead code finding

const (
	// SeverityLevelCritical indicates code that is definitely unreachable
	SeverityLevelCritical SeverityLevel = "critical"

	// SeverityLevelWarning indicates code that is likely unreachable
	SeverityLevelWarning SeverityLevel = "warning"

	// SeverityLevelInfo indicates potential optimization opportunities
	SeverityLevelInfo SeverityLevel = "info"
)

type SimilarityAnalyzer added in v1.5.0

type SimilarityAnalyzer interface {
	// ComputeSimilarity returns a similarity score between 0.0 and 1.0
	ComputeSimilarity(fragment1, fragment2 *CodeFragment) float64

	// GetName returns the name of this analyzer
	GetName() string
}

SimilarityAnalyzer defines the interface for computing similarity between code fragments. Each clone type should have its own analyzer implementation.
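
Implementing the interface requires only these two methods. A trivial stand-in implementation (the fragment type and interface are re-declared locally so the sketch is self-contained):

```go
package main

import (
	"fmt"
	"strings"
)

// fragment stands in for CodeFragment in this sketch.
type fragment struct{ content string }

// similarityAnalyzer mirrors the package interface shape.
type similarityAnalyzer interface {
	ComputeSimilarity(f1, f2 *fragment) float64
	GetName() string
}

// exactMatchAnalyzer is the simplest possible implementation:
// 1.0 when the trimmed contents are identical, 0.0 otherwise.
type exactMatchAnalyzer struct{}

func (exactMatchAnalyzer) ComputeSimilarity(f1, f2 *fragment) float64 {
	if strings.TrimSpace(f1.content) == strings.TrimSpace(f2.content) {
		return 1.0
	}
	return 0.0
}

func (exactMatchAnalyzer) GetName() string { return "exact-match" }

func main() {
	var a similarityAnalyzer = exactMatchAnalyzer{}
	fmt.Println(a.GetName(), a.ComputeSimilarity(&fragment{"x = 1"}, &fragment{"x = 1 "}))
}
```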

type StarMedoidGrouping

type StarMedoidGrouping struct {
	// contains filtered or unexported fields
}

StarMedoidGrouping groups fragments by iteratively selecting medoids and reassigning members. It uses only the provided pair similarities and does not recompute them.

func NewStarMedoidGrouping

func NewStarMedoidGrouping(threshold float64) *StarMedoidGrouping

NewStarMedoidGrouping creates a new Star/Medoid grouping with a similarity threshold. Default: maxIterations=10, early-stop after 3 consecutive no-change iterations.

func (*StarMedoidGrouping) GetName

func (s *StarMedoidGrouping) GetName() string

func (*StarMedoidGrouping) GroupClones

func (s *StarMedoidGrouping) GroupClones(pairs []*ClonePair) []*CloneGroup

GroupClones groups clone pairs using a star/medoid strategy.

type StructuralSimilarityAnalyzer added in v1.5.0

type StructuralSimilarityAnalyzer struct {
	// contains filtered or unexported fields
}

StructuralSimilarityAnalyzer computes structural similarity using APTED tree edit distance. This is used for Type-3 clone detection (near-miss clones with modifications).

func NewStructuralSimilarityAnalyzer added in v1.5.0

func NewStructuralSimilarityAnalyzer() *StructuralSimilarityAnalyzer

NewStructuralSimilarityAnalyzer creates a new structural similarity analyzer using the standard Python cost model (no normalization).

func NewStructuralSimilarityAnalyzerWithCostModel added in v1.5.0

func NewStructuralSimilarityAnalyzerWithCostModel(costModel CostModel) *StructuralSimilarityAnalyzer

NewStructuralSimilarityAnalyzerWithCostModel creates a structural similarity analyzer with a custom cost model.

func (*StructuralSimilarityAnalyzer) ComputeDistance added in v1.5.0

func (s *StructuralSimilarityAnalyzer) ComputeDistance(f1, f2 *CodeFragment) float64

ComputeDistance computes the edit distance between two code fragments. This is useful for additional metrics beyond similarity.

func (*StructuralSimilarityAnalyzer) ComputeSimilarity added in v1.5.0

func (s *StructuralSimilarityAnalyzer) ComputeSimilarity(f1, f2 *CodeFragment) float64

ComputeSimilarity computes the structural similarity between two code fragments using APTED tree edit distance.

func (*StructuralSimilarityAnalyzer) GetAnalyzer added in v1.5.0

func (s *StructuralSimilarityAnalyzer) GetAnalyzer() *APTEDAnalyzer

GetAnalyzer returns the underlying APTED analyzer (for advanced usage)

func (*StructuralSimilarityAnalyzer) GetName added in v1.5.0

func (s *StructuralSimilarityAnalyzer) GetName() string

GetName returns the name of this analyzer

type SyntacticSimilarityAnalyzer added in v1.5.0

type SyntacticSimilarityAnalyzer struct {
	// contains filtered or unexported fields
}

SyntacticSimilarityAnalyzer computes syntactic similarity using normalized AST hash comparison with Jaccard coefficient. This is used for Type-2 clone detection (syntactically identical but with different identifiers/literals).

Unlike the previous APTED-based approach which measured tree edit distance, this implementation compares sets of normalized node hashes. This eliminates false positives from structurally similar but semantically different code, as only nodes with identical normalized structure contribute to similarity.

func NewSyntacticSimilarityAnalyzer added in v1.5.0

func NewSyntacticSimilarityAnalyzer() *SyntacticSimilarityAnalyzer

NewSyntacticSimilarityAnalyzer creates a new syntactic similarity analyzer using normalized AST hash comparison that ignores identifier and literal differences.

func NewSyntacticSimilarityAnalyzerWithOptions added in v1.5.0

func NewSyntacticSimilarityAnalyzerWithOptions(ignoreLiterals, ignoreIdentifiers bool) *SyntacticSimilarityAnalyzer

NewSyntacticSimilarityAnalyzerWithOptions creates a syntactic similarity analyzer with configurable normalization options.

func (*SyntacticSimilarityAnalyzer) ComputeDistance added in v1.5.0

func (s *SyntacticSimilarityAnalyzer) ComputeDistance(f1, f2 *CodeFragment) float64

ComputeDistance computes the syntactic distance between two code fragments. Returns 1 - similarity, so distance ranges from 0 (identical) to 1 (completely different). Returns 0.0 for nil inputs (no distance can be computed).

func (*SyntacticSimilarityAnalyzer) ComputeSimilarity added in v1.5.0

func (s *SyntacticSimilarityAnalyzer) ComputeSimilarity(f1, f2 *CodeFragment) float64

ComputeSimilarity computes the syntactic similarity between two code fragments using Jaccard coefficient of normalized AST hash sets. It ignores differences in identifier names and literal values, focusing only on the structural syntax pattern.
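
The Jaccard coefficient over hash sets is |A ∩ B| / |A ∪ B|. A self-contained sketch of that comparison:

```go
package main

import "fmt"

// jaccard computes |A ∩ B| / |A ∪ B| over two sets of normalized
// node hashes, the core of the Type-2 similarity comparison.
func jaccard(a, b map[uint64]bool) float64 {
	inter, union := 0, 0
	for h := range a {
		union++
		if b[h] {
			inter++
		}
	}
	for h := range b {
		if !a[h] {
			union++
		}
	}
	if union == 0 {
		return 0
	}
	return float64(inter) / float64(union)
}

func main() {
	a := map[uint64]bool{1: true, 2: true, 3: true}
	b := map[uint64]bool{2: true, 3: true, 4: true}
	fmt.Println(jaccard(a, b)) // 2 shared hashes out of 4 distinct
}
```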

func (*SyntacticSimilarityAnalyzer) GetExtractor added in v1.9.0

GetExtractor returns the underlying feature extractor (for advanced usage)

func (*SyntacticSimilarityAnalyzer) GetName added in v1.5.0

func (s *SyntacticSimilarityAnalyzer) GetName() string

GetName returns the name of this analyzer

type SystemMetrics

type SystemMetrics struct {
	// Overall structure
	TotalModules      int // Total number of modules
	TotalDependencies int // Total number of dependencies
	PackageCount      int // Number of packages

	// Dependency metrics
	AverageFanIn    float64 // Average number of incoming dependencies
	AverageFanOut   float64 // Average number of outgoing dependencies
	DependencyRatio float64 // Total dependencies / Total modules

	// Coupling and cohesion
	AverageInstability    float64 // System average instability
	AverageAbstractness   float64 // System average abstractness
	MainSequenceDeviation float64 // Average distance from main sequence

	// Modularity
	ModularityIndex float64 // Measure of system decomposition quality
	ComponentRatio  float64 // Ratio of strongly connected components

	// Quality indicators
	CyclicDependencies int     // Number of modules in cycles
	MaxDependencyDepth int     // Maximum dependency chain length
	SystemComplexity   float64 // Overall system complexity score

	// Refactoring
	RefactoringPriority []string // Modules needing refactoring (highest priority first)
}

SystemMetrics contains system-wide quality metrics

type TextualSimilarityAnalyzer added in v1.5.0

type TextualSimilarityAnalyzer struct {
	// contains filtered or unexported fields
}

TextualSimilarityAnalyzer computes textual similarity for Type-1 clone detection. Type-1 clones are identical code fragments except for whitespace and comments.

func NewTextualSimilarityAnalyzer added in v1.5.0

func NewTextualSimilarityAnalyzer() *TextualSimilarityAnalyzer

NewTextualSimilarityAnalyzer creates a new textual similarity analyzer with default settings (normalize whitespace and remove comments).

func NewTextualSimilarityAnalyzerWithConfig added in v1.5.0

func NewTextualSimilarityAnalyzerWithConfig(config *TextualSimilarityConfig) *TextualSimilarityAnalyzer

NewTextualSimilarityAnalyzerWithConfig creates a textual similarity analyzer with custom configuration.

func (*TextualSimilarityAnalyzer) ComputeSimilarity added in v1.5.0

func (t *TextualSimilarityAnalyzer) ComputeSimilarity(f1, f2 *CodeFragment) float64

ComputeSimilarity computes the textual similarity between two code fragments. Returns 1.0 for identical content (after normalization), or a Levenshtein-based similarity score for near-matches.
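
A Levenshtein-based similarity can be derived as 1 - distance/maxLen. A self-contained sketch (the analyzer's exact normalization is an assumption here):

```go
package main

import "fmt"

// levenshtein computes the edit distance between two strings using
// two rolling rows of the classic dynamic-programming table.
func levenshtein(a, b string) int {
	prev := make([]int, len(b)+1)
	curr := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		curr[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			curr[j] = min2(min2(curr[j-1]+1, prev[j]+1), prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(b)]
}

func min2(x, y int) int {
	if x < y {
		return x
	}
	return y
}

func main() {
	a, b := "return a+b", "return a-b"
	d := levenshtein(a, b)
	maxLen := len(a)
	if len(b) > maxLen {
		maxLen = len(b)
	}
	fmt.Printf("%.2f\n", 1-float64(d)/float64(maxLen)) // one edit in ten characters
}
```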

func (*TextualSimilarityAnalyzer) GetName added in v1.5.0

func (t *TextualSimilarityAnalyzer) GetName() string

GetName returns the name of this analyzer

type TextualSimilarityConfig added in v1.5.0

type TextualSimilarityConfig struct {
	NormalizeWhitespace bool
	RemoveComments      bool
}

TextualSimilarityConfig holds configuration for textual similarity analysis

type TreeConverter

type TreeConverter struct {
	// contains filtered or unexported fields
}

TreeConverter converts parser AST nodes to APTED tree nodes

func NewTreeConverter

func NewTreeConverter() *TreeConverter

NewTreeConverter creates a new tree converter with default settings (no docstring skipping)

func NewTreeConverterWithConfig added in v1.5.2

func NewTreeConverterWithConfig(skipDocstrings bool) *TreeConverter

NewTreeConverterWithConfig creates a tree converter with configuration

func (*TreeConverter) ConvertAST

func (tc *TreeConverter) ConvertAST(astNode *parser.Node) *TreeNode

ConvertAST converts a parser AST node to an APTED tree

type TreeEditResult

type TreeEditResult struct {
	Distance   float64
	Similarity float64
	Tree1Size  int
	Tree2Size  int
	Operations int // Estimated number of edit operations
}

TreeEditResult holds the result of tree edit distance computation

type TreeNode

type TreeNode struct {
	// Unique identifier for this node
	ID int

	// Label for the node (typically the node type or value)
	Label string

	// Tree structure
	Children []*TreeNode
	Parent   *TreeNode

	// APTED-specific fields for optimization
	PostOrderID  int  // Post-order traversal position
	LeftMostLeaf int  // Left-most leaf descendant
	KeyRoot      bool // Whether this node is a key root

	// Optional metadata from original AST
	OriginalNode *parser.Node
}

TreeNode represents a node in the ordered tree for APTED algorithm

func GetNodeByPostOrderID

func GetNodeByPostOrderID(root *TreeNode, postOrderID int) *TreeNode

GetNodeByPostOrderID finds a node by its post-order ID

func GetSubtreeNodes

func GetSubtreeNodes(root *TreeNode) []*TreeNode

GetSubtreeNodes returns all nodes in the subtree rooted at the given node

func GetSubtreeNodesWithDepthLimit

func GetSubtreeNodesWithDepthLimit(root *TreeNode, maxDepth int) []*TreeNode

GetSubtreeNodesWithDepthLimit returns all nodes with maximum recursion depth limit

func NewTreeNode

func NewTreeNode(id int, label string) *TreeNode

NewTreeNode creates a new tree node with the given ID and label

func (*TreeNode) AddChild

func (t *TreeNode) AddChild(child *TreeNode)

AddChild adds a child node to this node

func (*TreeNode) Height

func (t *TreeNode) Height() int

Height returns the height of the subtree rooted at this node

func (*TreeNode) HeightWithDepthLimit

func (t *TreeNode) HeightWithDepthLimit(maxDepth int) int

HeightWithDepthLimit returns the height with maximum recursion depth limit

func (*TreeNode) IsLeaf

func (t *TreeNode) IsLeaf() bool

IsLeaf returns true if this node has no children

func (*TreeNode) Size

func (t *TreeNode) Size() int

Size returns the size of the subtree rooted at this node

func (*TreeNode) SizeWithDepthLimit

func (t *TreeNode) SizeWithDepthLimit(maxDepth int) int

SizeWithDepthLimit returns the size with maximum recursion depth limit

func (*TreeNode) String

func (t *TreeNode) String() string

String returns a string representation of the node
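
Size and Height are simple recursions over the child lists. A self-contained sketch mirroring the essentials of TreeNode (here height counts nodes on the longest root-to-leaf path; the package's convention may differ):

```go
package main

import "fmt"

// treeNode mirrors the essentials of TreeNode for this sketch.
type treeNode struct {
	label    string
	children []*treeNode
}

func (t *treeNode) addChild(c *treeNode) { t.children = append(t.children, c) }

// size counts all nodes in the subtree, including this one.
func (t *treeNode) size() int {
	n := 1
	for _, c := range t.children {
		n += c.size()
	}
	return n
}

// height is the number of nodes on the longest root-to-leaf path.
func (t *treeNode) height() int {
	max := 0
	for _, c := range t.children {
		if h := c.height(); h > max {
			max = h
		}
	}
	return max + 1
}

func main() {
	root := &treeNode{label: "module"}
	fn := &treeNode{label: "func_def"}
	root.addChild(fn)
	fn.addChild(&treeNode{label: "return"})
	fmt.Println(root.size(), root.height())
}
```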

type VarReference added in v1.5.0

type VarReference struct {
	Name      string       // Variable name
	Kind      DefUseKind   // Type of reference
	Block     *BasicBlock  // Which block contains this reference
	Statement *parser.Node // The AST statement containing the reference
	Position  int          // Position within block's Statements slice
}

VarReference represents a single definition or use of a variable

func NewVarReference added in v1.5.0

func NewVarReference(name string, kind DefUseKind, block *BasicBlock, stmt *parser.Node, pos int) *VarReference

NewVarReference creates a new variable reference

type WeightedCostModel

type WeightedCostModel struct {
	InsertWeight  float64
	DeleteWeight  float64
	RenameWeight  float64
	BaseCostModel CostModel
}

WeightedCostModel allows custom weights for different operation types

func NewWeightedCostModel

func NewWeightedCostModel(insertWeight, deleteWeight, renameWeight float64, baseCostModel CostModel) *WeightedCostModel

NewWeightedCostModel creates a new weighted cost model

func (*WeightedCostModel) Delete

func (c *WeightedCostModel) Delete(node *TreeNode) float64

Delete returns the weighted cost of deleting a node

func (*WeightedCostModel) Insert

func (c *WeightedCostModel) Insert(node *TreeNode) float64

Insert returns the weighted cost of inserting a node

func (*WeightedCostModel) Rename

func (c *WeightedCostModel) Rename(node1, node2 *TreeNode) float64

Rename returns the weighted cost of renaming node1 to node2
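
The weighted model scales a base model's per-operation costs. A self-contained sketch with locally re-declared types (the real CostModel operates on *TreeNode rather than plain labels):

```go
package main

import "fmt"

// costModel mirrors the shape of the package's CostModel interface
// for this sketch, using labels instead of tree nodes.
type costModel interface {
	Insert(label string) float64
	Delete(label string) float64
	Rename(l1, l2 string) float64
}

// unitCost is a base model: every operation costs 1, except renaming
// a node to an identical label, which is free.
type unitCost struct{}

func (unitCost) Insert(string) float64 { return 1 }
func (unitCost) Delete(string) float64 { return 1 }
func (unitCost) Rename(l1, l2 string) float64 {
	if l1 == l2 {
		return 0
	}
	return 1
}

// weighted scales a base model's costs per operation type, like
// WeightedCostModel wrapping a base CostModel.
type weighted struct {
	insertW, deleteW, renameW float64
	base                      costModel
}

func (w weighted) Insert(l string) float64    { return w.insertW * w.base.Insert(l) }
func (w weighted) Delete(l string) float64    { return w.deleteW * w.base.Delete(l) }
func (w weighted) Rename(a, b string) float64 { return w.renameW * w.base.Rename(a, b) }

func main() {
	// Make deletions twice as expensive and insertions half as expensive.
	m := weighted{insertW: 0.5, deleteW: 2, renameW: 1, base: unitCost{}}
	fmt.Println(m.Insert("if"), m.Delete("if"), m.Rename("if", "while"))
}
```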
