Documentation
Overview ¶
Package clones provides clone detection analysis using MinHash and LSH.
Index ¶
- Constants
- func RegisterPlotSections()
- type Aggregator
- type Analyzer
- func (a *Analyzer) Analyze(root *node.Node) (analyze.Report, error)
- func (a *Analyzer) Configure(facts map[string]any) error
- func (a *Analyzer) CreateAggregator() analyze.ResultAggregator
- func (a *Analyzer) CreateReportSection(report analyze.Report) analyze.ReportSection
- func (a *Analyzer) CreateVisitor() analyze.AnalysisVisitor
- func (a *Analyzer) Descriptor() analyze.Descriptor
- func (a *Analyzer) Flag() string
- func (a *Analyzer) FormatReport(report analyze.Report, w io.Writer) error
- func (a *Analyzer) FormatReportBinary(report analyze.Report, w io.Writer) error
- func (a *Analyzer) FormatReportJSON(report analyze.Report, w io.Writer) error
- func (a *Analyzer) FormatReportPlot(report analyze.Report, w io.Writer) error
- func (a *Analyzer) FormatReportYAML(report analyze.Report, w io.Writer) error
- func (a *Analyzer) ListConfigurationOptions() []pipeline.ConfigurationOption
- func (a *Analyzer) Name() string
- func (a *Analyzer) Thresholds() analyze.Thresholds
- type ClonePair
- type ComputedMetrics
- type PairKey
- type ReportSection
- type Shingler
- type Visitor
Constants ¶
const (
	ConfigClonesMaxClonePairs        = "Clones.MaxClonePairs"
	ConfigClonesNumHashes            = "Clones.NumHashes"
	ConfigClonesNumBands             = "Clones.NumBands"
	ConfigClonesNumRows              = "Clones.NumRows"
	ConfigClonesShingleSize          = "Clones.ShingleSize"
	ConfigClonesSimilarityType2      = "Clones.SimilarityType2"
	ConfigClonesSimilarityType3      = "Clones.SimilarityType3"
	ConfigClonesThresholdRatioYellow = "Clones.ThresholdRatioYellow"
	ConfigClonesThresholdRatioRed    = "Clones.ThresholdRatioRed"
	ConfigClonesThresholdPairsYellow = "Clones.ThresholdPairsYellow"
	ConfigClonesThresholdPairsRed    = "Clones.ThresholdPairsRed"
)
Configuration option keys for the clones analyzer.
const (
	// CloneType1 represents an exact clone (identical AST structure and tokens).
	CloneType1 = "Type-1"
	// CloneType2 represents a renamed clone (identical AST structure, different tokens).
	CloneType2 = "Type-2"
	// CloneType3 represents a near-miss clone (similar AST structure).
	CloneType3 = "Type-3"
)
Clone type constants.
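The clone type labels sit alongside the Clones.SimilarityType2 and Clones.SimilarityType3 option keys, which suggests a threshold-based classification. The following is a minimal sketch of that idea, not the package's actual rule: the classify helper, the treatment of a similarity of 1.0 as Type-1, and the threshold values 0.8 and 0.5 are all assumptions made for illustration.

```go
package main

import "fmt"

// Clone type labels copied from the constants above.
const (
	CloneType1 = "Type-1"
	CloneType2 = "Type-2"
	CloneType3 = "Type-3"
)

// classify maps a similarity score to a clone type label.
// Hypothetical helper: the rule and both thresholds are assumptions,
// not the package's documented behavior.
func classify(sim, type2Threshold, type3Threshold float64) string {
	switch {
	case sim >= 1.0:
		return CloneType1
	case sim >= type2Threshold:
		return CloneType2
	case sim >= type3Threshold:
		return CloneType3
	default:
		return "" // not similar enough to be reported as a clone
	}
}

func main() {
	for _, sim := range []float64{1.0, 0.85, 0.6, 0.2} {
		fmt.Printf("%.2f -> %q\n", sim, classify(sim, 0.8, 0.5))
	}
}
```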
const DefaultMaxClonePairs = 1000
DefaultMaxClonePairs is the maximum number of clone pairs stored in the report detail. The total_clone_pairs count remains exact (not capped). Zero means unlimited.
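The difference between the capped detail list and the exact total can be sketched as follows. The pairStore type and collect function are hypothetical and only illustrate the documented behavior: storage stops at the limit, counting does not, and zero disables the limit.

```go
package main

import "fmt"

// pairStore keeps at most maxPairs entries but counts every pair it sees.
// Hypothetical helper, not a type from this package.
type pairStore struct {
	maxPairs int      // zero means unlimited
	total    int      // exact count, never capped
	stored   []string // detail kept for the report
}

// add records one pair, storing its detail only while under the cap.
func (s *pairStore) add(pair string) {
	s.total++
	if s.maxPairs == 0 || len(s.stored) < s.maxPairs {
		s.stored = append(s.stored, pair)
	}
}

// collect feeds n synthetic pairs into a store and reports both counts.
func collect(maxPairs, n int) (total, kept int) {
	s := &pairStore{maxPairs: maxPairs}
	for i := 0; i < n; i++ {
		s.add(fmt.Sprintf("pair-%d", i))
	}
	return s.total, len(s.stored)
}

func main() {
	total, kept := collect(2, 5)
	fmt.Println("capped:", total, kept)
	total, kept = collect(0, 5)
	fmt.Println("unlimited:", total, kept)
}
```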
Variables ¶
This section is empty.
Functions ¶
func RegisterPlotSections ¶
func RegisterPlotSections()
RegisterPlotSections registers the clone detection plot section renderer.
Types ¶
type Aggregator ¶
type Aggregator struct {
// MaxClonePairs limits the number of clone pairs stored in the report detail.
// The total_clone_pairs count remains exact. Zero means unlimited.
MaxClonePairs int
NumBands int
NumRows int
SimilarityType3 float64
// contains filtered or unexported fields
}
Aggregator collects per-file function signatures and performs global cross-file clone detection in GetResult().
func NewAggregator ¶
func NewAggregator() *Aggregator
NewAggregator creates a new clone detection aggregator.
func (*Aggregator) Aggregate ¶
func (a *Aggregator) Aggregate(results map[string]analyze.Report)
Aggregate extracts function signatures from per-file reports. Signatures are qualified with the source file path so that same-named functions across files are distinguishable.
func (*Aggregator) GetResult ¶
func (a *Aggregator) GetResult() analyze.Report
GetResult builds a global LSH index from all collected signatures, finds cross-file clone pairs, and computes metrics from global totals.
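As a rough illustration of how an LSH index over MinHash signatures finds candidate pairs, the sketch below splits each signature into bands of rows and treats two functions as candidates when they agree on every row of at least one band. The numBands and numRows parameters mirror the NumBands and NumRows fields; the bandKey and candidatePairs helpers, the FNV hashing, and the sample signatures are assumptions, not the package's implementation.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"sort"
)

// bandKey hashes one band of a signature. The band index is mixed in so
// that equal rows in different bands land in different buckets.
func bandKey(band int, rows []uint64) uint64 {
	h := fnv.New64a()
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], uint64(band))
	h.Write(buf[:])
	for _, r := range rows {
		binary.LittleEndian.PutUint64(buf[:], r)
		h.Write(buf[:])
	}
	return h.Sum64()
}

// candidatePairs returns name pairs that share a bucket in at least one band.
// Candidates would still need their similarity checked afterwards.
func candidatePairs(sigs map[string][]uint64, numBands, numRows int) [][2]string {
	names := make([]string, 0, len(sigs))
	for name := range sigs {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order within each bucket

	buckets := make(map[uint64][]string)
	for _, name := range names {
		sig := sigs[name]
		if len(sig) < numBands*numRows {
			continue // signature too short for this banding scheme
		}
		for b := 0; b < numBands; b++ {
			key := bandKey(b, sig[b*numRows:(b+1)*numRows])
			buckets[key] = append(buckets[key], name)
		}
	}

	seen := make(map[[2]string]bool)
	var pairs [][2]string
	for _, members := range buckets {
		for i := 0; i < len(members); i++ {
			for j := i + 1; j < len(members); j++ {
				p := [2]string{members[i], members[j]}
				if !seen[p] {
					seen[p] = true
					pairs = append(pairs, p)
				}
			}
		}
	}
	sort.Slice(pairs, func(i, j int) bool {
		if pairs[i][0] != pairs[j][0] {
			return pairs[i][0] < pairs[j][0]
		}
		return pairs[i][1] < pairs[j][1]
	})
	return pairs
}

func main() {
	sigs := map[string][]uint64{
		"a.go:Foo": {1, 2, 3, 4},
		"b.go:Bar": {1, 2, 9, 9}, // agrees with Foo on band 0
		"c.go:Baz": {7, 7, 7, 7},
	}
	fmt.Println(candidatePairs(sigs, 2, 2))
}
```

More bands with fewer rows each make the index more permissive; fewer bands with more rows make it stricter.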
type Analyzer ¶
type Analyzer struct {
// contains filtered or unexported fields
}
Analyzer provides clone detection analysis using MinHash and LSH.
func (*Analyzer) CreateAggregator ¶
func (a *Analyzer) CreateAggregator() analyze.ResultAggregator
CreateAggregator returns a new aggregator for clone analysis.
func (*Analyzer) CreateReportSection ¶
func (a *Analyzer) CreateReportSection(report analyze.Report) analyze.ReportSection
CreateReportSection creates a ReportSection from report data.
func (*Analyzer) CreateVisitor ¶
func (a *Analyzer) CreateVisitor() analyze.AnalysisVisitor
CreateVisitor creates a new visitor for single-pass traversal optimization.
func (*Analyzer) Descriptor ¶
func (a *Analyzer) Descriptor() analyze.Descriptor
Descriptor returns stable analyzer metadata.
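func (*Analyzer) FormatReport ¶
func (a *Analyzer) FormatReport(report analyze.Report, w io.Writer) error
FormatReport formats clone analysis results as human-readable text.
func (*Analyzer) FormatReportBinary ¶
func (a *Analyzer) FormatReportBinary(report analyze.Report, w io.Writer) error
FormatReportBinary formats clone analysis results as a binary envelope.
func (*Analyzer) FormatReportJSON ¶
func (a *Analyzer) FormatReportJSON(report analyze.Report, w io.Writer) error
FormatReportJSON formats clone analysis results as JSON.
func (*Analyzer) FormatReportPlot ¶
func (a *Analyzer) FormatReportPlot(report analyze.Report, w io.Writer) error
FormatReportPlot formats clone analysis results as an HTML plot.
func (*Analyzer) FormatReportYAML ¶
func (a *Analyzer) FormatReportYAML(report analyze.Report, w io.Writer) error
FormatReportYAML formats clone analysis results as YAML.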
func (*Analyzer) ListConfigurationOptions ¶
func (a *Analyzer) ListConfigurationOptions() []pipeline.ConfigurationOption
ListConfigurationOptions returns configuration options.
func (*Analyzer) Thresholds ¶
func (a *Analyzer) Thresholds() analyze.Thresholds
Thresholds returns the color-coded thresholds for clone metrics.
type ClonePair ¶
type ClonePair struct {
FuncA string `json:"func_a" yaml:"func_a"`
FuncB string `json:"func_b" yaml:"func_b"`
Similarity float64 `json:"similarity" yaml:"similarity"`
CloneType string `json:"clone_type" yaml:"clone_type"`
}
ClonePair represents a detected clone relationship between two functions.
type ComputedMetrics ¶
type ComputedMetrics struct {
TotalFunctions int `json:"total_functions" yaml:"total_functions"`
TotalClonePairs int `json:"total_clone_pairs" yaml:"total_clone_pairs"`
CloneRatio float64 `json:"clone_ratio" yaml:"clone_ratio"`
CloneTypeDist map[string]int `json:"clone_type_distribution,omitempty" yaml:"clone_type_distribution,omitempty"`
ClonePairs []ClonePair `json:"clone_pairs" yaml:"clone_pairs"`
Message string `json:"message" yaml:"message"`
}
ComputedMetrics holds computed clone detection metrics for JSON/YAML/binary export.
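To show what the struct tags produce on export, the sketch below copies the ClonePair and ComputedMetrics definitions into a standalone program and marshals them with encoding/json. The encode and sample helpers and all field values are made-up sample data; in particular, the values say nothing about how the package computes clone_ratio or message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClonePair and ComputedMetrics are local copies of the documented structs.
type ClonePair struct {
	FuncA      string  `json:"func_a" yaml:"func_a"`
	FuncB      string  `json:"func_b" yaml:"func_b"`
	Similarity float64 `json:"similarity" yaml:"similarity"`
	CloneType  string  `json:"clone_type" yaml:"clone_type"`
}

type ComputedMetrics struct {
	TotalFunctions  int            `json:"total_functions" yaml:"total_functions"`
	TotalClonePairs int            `json:"total_clone_pairs" yaml:"total_clone_pairs"`
	CloneRatio      float64        `json:"clone_ratio" yaml:"clone_ratio"`
	CloneTypeDist   map[string]int `json:"clone_type_distribution,omitempty" yaml:"clone_type_distribution,omitempty"`
	ClonePairs      []ClonePair    `json:"clone_pairs" yaml:"clone_pairs"`
	Message         string         `json:"message" yaml:"message"`
}

// encode returns the compact JSON form of the metrics.
func encode(m ComputedMetrics) (string, error) {
	b, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// sample builds illustrative metrics; every value here is invented.
func sample() ComputedMetrics {
	return ComputedMetrics{
		TotalFunctions:  4,
		TotalClonePairs: 1,
		CloneRatio:      0.5,
		CloneTypeDist:   map[string]int{"Type-2": 1},
		ClonePairs: []ClonePair{
			{FuncA: "a.go:Foo", FuncB: "b.go:Bar", Similarity: 0.9, CloneType: "Type-2"},
		},
		Message: "1 clone pair found",
	}
}

func main() {
	out, err := encode(sample())
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```

Because clone_type_distribution is tagged omitempty, the key is absent when the map is nil or empty, whereas clone_pairs is always present and encodes as null for a nil slice.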
type ReportSection ¶
type ReportSection struct {
analyze.BaseReportSection
// contains filtered or unexported fields
}
ReportSection implements the analyze.ReportSection interface for clone detection.
func NewReportSection ¶
func NewReportSection(report analyze.Report) *ReportSection
NewReportSection creates a ReportSection from clone detection report data.
func (*ReportSection) AllIssues ¶
func (s *ReportSection) AllIssues() []analyze.Issue
AllIssues returns all clone pairs as issues sorted by similarity descending.
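The ordering described here can be sketched with the standard sort package. The sortBySimilarity helper and the trimmed-down ClonePair copy are illustrative only; how the package converts pairs into analyze.Issue values is not shown in this documentation and is not reproduced.

```go
package main

import (
	"fmt"
	"sort"
)

// ClonePair is a trimmed local copy of the documented struct.
type ClonePair struct {
	FuncA      string
	FuncB      string
	Similarity float64
}

// sortBySimilarity returns a copy of pairs ordered from most to least similar.
// Hypothetical helper; ties keep their original relative order.
func sortBySimilarity(pairs []ClonePair) []ClonePair {
	out := make([]ClonePair, len(pairs))
	copy(out, pairs)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Similarity > out[j].Similarity
	})
	return out
}

func main() {
	pairs := []ClonePair{
		{FuncA: "a.go:Foo", FuncB: "b.go:Bar", Similarity: 0.6},
		{FuncA: "a.go:Foo", FuncB: "c.go:Baz", Similarity: 1.0},
		{FuncA: "b.go:Bar", FuncB: "c.go:Baz", Similarity: 0.8},
	}
	for _, p := range sortBySimilarity(pairs) {
		fmt.Printf("%.1f %s <-> %s\n", p.Similarity, p.FuncA, p.FuncB)
	}
}
```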
func (*ReportSection) Distribution ¶
func (s *ReportSection) Distribution() []analyze.DistributionItem
Distribution returns clone type distribution data. Uses the full-population distribution when available, falling back to the capped pairs array.
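The fallback can be pictured as below: when a full-population count per clone type is present it is used directly, otherwise the counts are rebuilt from whatever pairs survived the cap. The distribution helper is hypothetical, and treating an empty map as "not available" is an assumption.

```go
package main

import "fmt"

// ClonePair is a trimmed local copy of the documented struct.
type ClonePair struct {
	CloneType string
}

// distribution prefers the full-population counts and falls back to
// counting the (possibly capped) pairs. Hypothetical helper.
func distribution(full map[string]int, capped []ClonePair) map[string]int {
	if len(full) > 0 {
		return full
	}
	counts := make(map[string]int)
	for _, p := range capped {
		counts[p.CloneType]++
	}
	return counts
}

func main() {
	capped := []ClonePair{{CloneType: "Type-2"}, {CloneType: "Type-3"}, {CloneType: "Type-2"}}

	// Full counts available: they win even though the pair list is shorter.
	fmt.Println(distribution(map[string]int{"Type-2": 40, "Type-3": 7}, capped))

	// No full counts: rebuild from the capped pairs.
	fmt.Println(distribution(nil, capped))
}
```

The fallback undercounts whenever the pair list was truncated, which is why the full-population distribution is preferred when available.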
func (*ReportSection) KeyMetrics ¶
func (s *ReportSection) KeyMetrics() []analyze.Metric
KeyMetrics returns ordered key metrics for display.
type Shingler ¶
type Shingler struct {
// contains filtered or unexported fields
}
Shingler extracts k-gram shingles from UAST function subtrees. A shingle is a sequence of k consecutive node types from a pre-order traversal.
func NewShingler ¶
NewShingler creates a new Shingler with the given k-gram size.
func (*Shingler) ExtractShingles ¶
ExtractShingles returns k-gram shingles from a function's UAST subtree. Each shingle is a byte slice representing k consecutive node types joined by "|". Returns nil if the subtree has fewer than k nodes.
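The shingling scheme described above can be sketched on a toy tree. The Node type is a hypothetical stand-in for the UAST node type, and the preorder and shingles helpers are illustrative rather than the package's code; only the documented behavior (pre-order node types, windows of k joined by "|", nil for fewer than k nodes) is taken from the description.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical stand-in for a UAST node.
// Only the node type and the children matter for shingling.
type Node struct {
	Type     string
	Children []*Node
}

// preorder appends node types parent first, then children left to right.
func preorder(n *Node, out []string) []string {
	if n == nil {
		return out
	}
	out = append(out, n.Type)
	for _, c := range n.Children {
		out = preorder(c, out)
	}
	return out
}

// shingles returns every window of k consecutive node types joined by "|".
// It returns nil when the subtree has fewer than k nodes.
func shingles(root *Node, k int) [][]byte {
	if k <= 0 {
		return nil
	}
	types := preorder(root, nil)
	if len(types) < k {
		return nil
	}
	out := make([][]byte, 0, len(types)-k+1)
	for i := 0; i+k <= len(types); i++ {
		out = append(out, []byte(strings.Join(types[i:i+k], "|")))
	}
	return out
}

func main() {
	fn := &Node{Type: "Function", Children: []*Node{
		{Type: "Block", Children: []*Node{
			{Type: "If"},
			{Type: "Return"},
		}},
	}}
	for _, s := range shingles(fn, 3) {
		fmt.Println(string(s))
	}
	// Output:
	// Function|Block|If
	// Block|If|Return
}
```

Because shingles use node types rather than identifiers or literals, two functions that differ only in naming yield the same shingle set.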
type Visitor ¶
type Visitor struct {
// contains filtered or unexported fields
}
Visitor implements the AnalysisVisitor interface for clone detection. It collects function nodes during traversal and exports MinHash signatures for cross-file clone detection by the aggregator.
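For readers unfamiliar with MinHash, the sketch below shows the general technique: each of numHashes seeded hash functions keeps the smallest hash over a function's shingles, and the fraction of positions where two signatures agree estimates the Jaccard similarity of the shingle sets. The minHash and estimate helpers and the seeded-FNV hash family are assumptions chosen for brevity, not the package's implementation.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"math"
)

// minHash returns a signature with one entry per seeded hash function:
// the minimum hash value over all shingles.
func minHash(shingles [][]byte, numHashes int) []uint64 {
	sig := make([]uint64, numHashes)
	var seed [8]byte
	for i := range sig {
		sig[i] = math.MaxUint64
		binary.LittleEndian.PutUint64(seed[:], uint64(i))
		for _, s := range shingles {
			h := fnv.New64a()
			h.Write(seed[:]) // the seed prefix selects the i-th hash function
			h.Write(s)
			if v := h.Sum64(); v < sig[i] {
				sig[i] = v
			}
		}
	}
	return sig
}

// estimate returns the fraction of positions where two signatures agree.
// It returns 0 for empty or mismatched signatures.
func estimate(a, b []uint64) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	same := 0
	for i := range a {
		if a[i] == b[i] {
			same++
		}
	}
	return float64(same) / float64(len(a))
}

func main() {
	a := [][]byte{[]byte("Function|Block|If"), []byte("Block|If|Return")}
	b := [][]byte{[]byte("Function|Block|If"), []byte("Block|If|Call")}

	sigA := minHash(a, 128)
	sigB := minHash(b, 128)
	fmt.Printf("self: %.2f\n", estimate(sigA, sigA))
	fmt.Printf("a vs b (estimate of Jaccard 1/3): %.2f\n", estimate(sigA, sigB))
}
```

The signatures have a fixed length regardless of function size, which is what lets an aggregator compare functions across files without keeping the subtrees themselves.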
func (*Visitor) GetReport ¶
GetReport returns the clone detection report with function signatures. Detection is deferred to the aggregator for cross-file comparison.