clones

package
v1.0.0-rc.1
Warning

This package is not in the latest version of its module.

Published: May 13, 2026 | License: Apache-2.0 | Imports: 19 | Imported by: 0

Documentation

Overview

Package clones provides clone detection analysis using MinHash and LSH.
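To make the MinHash part of the overview concrete, here is a minimal, self-contained sketch. It is not the package's implementation: the seeded FNV-1a hashing and the helper names `minHashSignature` and `estimateJaccard` are assumptions chosen for illustration only.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// minHashSignature returns a signature with numHashes slots. Each slot holds
// the smallest seeded hash value seen across all shingles.
// Hypothetical helper: the hash family here is an assumption.
func minHashSignature(shingles [][]byte, numHashes int) []uint64 {
	sig := make([]uint64, numHashes)
	for i := range sig {
		sig[i] = ^uint64(0) // start at the maximum value
	}
	for _, sh := range shingles {
		for i := 0; i < numHashes; i++ {
			h := fnv.New64a()
			h.Write([]byte{byte(i), byte(i >> 8)}) // seed prefix selects the hash function
			h.Write(sh)
			if v := h.Sum64(); v < sig[i] {
				sig[i] = v
			}
		}
	}
	return sig
}

// estimateJaccard returns the fraction of slots on which two signatures agree,
// which approximates the Jaccard similarity of the underlying shingle sets.
func estimateJaccard(a, b []uint64) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	match := 0
	for i := range a {
		if a[i] == b[i] {
			match++
		}
	}
	return float64(match) / float64(len(a))
}

func main() {
	a := [][]byte{[]byte("Function|Block|Return"), []byte("Block|Return|Identifier")}
	b := [][]byte{[]byte("Function|Block|Return"), []byte("Block|Return|Literal")}

	// Identical shingle sets always produce identical signatures.
	fmt.Println(estimateJaccard(minHashSignature(a, 64), minHashSignature(a, 64))) // prints 1

	// Partially overlapping sets give an estimate between 0 and 1.
	fmt.Println(estimateJaccard(minHashSignature(a, 64), minHashSignature(b, 64)))
}
```

More hash slots give a tighter similarity estimate at the cost of more work per function.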

Constants

const (
	ConfigClonesMaxClonePairs        = "Clones.MaxClonePairs"
	ConfigClonesNumHashes            = "Clones.NumHashes"
	ConfigClonesNumBands             = "Clones.NumBands"
	ConfigClonesNumRows              = "Clones.NumRows"
	ConfigClonesShingleSize          = "Clones.ShingleSize"
	ConfigClonesSimilarityType2      = "Clones.SimilarityType2"
	ConfigClonesSimilarityType3      = "Clones.SimilarityType3"
	ConfigClonesThresholdRatioYellow = "Clones.ThresholdRatioYellow"
	ConfigClonesThresholdRatioRed    = "Clones.ThresholdRatioRed"
	ConfigClonesThresholdPairsYellow = "Clones.ThresholdPairsYellow"
	ConfigClonesThresholdPairsRed    = "Clones.ThresholdPairsRed"
)

Configuration option keys for the clones analyzer.
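As a hedged sketch, the keys above could be used to build the `map[string]any` that `Configure` accepts. The value types (plain `int`), the sample values, and the `bandingConsistent` helper are all assumptions for illustration. The check that bands times rows equals the number of hashes is the usual LSH banding convention, not something this page states about the package.

```go
package main

import "fmt"

// Local copies of three of the keys listed above, so the sketch compiles on its own.
const (
	ConfigClonesNumHashes = "Clones.NumHashes"
	ConfigClonesNumBands  = "Clones.NumBands"
	ConfigClonesNumRows   = "Clones.NumRows"
)

// bandingConsistent is a hypothetical helper. It reports whether the three
// values are present as ints and satisfy bands*rows == hashes.
func bandingConsistent(facts map[string]any) bool {
	hashes, ok1 := facts[ConfigClonesNumHashes].(int)
	bands, ok2 := facts[ConfigClonesNumBands].(int)
	rows, ok3 := facts[ConfigClonesNumRows].(int)
	return ok1 && ok2 && ok3 && bands*rows == hashes
}

func main() {
	// Illustrative values only; the package defaults are not shown on this page.
	facts := map[string]any{
		ConfigClonesNumHashes: 128,
		ConfigClonesNumBands:  16,
		ConfigClonesNumRows:   8,
	}
	fmt.Println(bandingConsistent(facts)) // prints true
}
```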

const (
	// CloneType1 represents an exact clone (identical AST structure and tokens).
	CloneType1 = "Type-1"

	// CloneType2 represents a renamed clone (identical AST structure, different tokens).
	CloneType2 = "Type-2"

	// CloneType3 represents a near-miss clone (similar AST structure).
	CloneType3 = "Type-3"
)

Clone type constants.
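One plausible way to map a similarity score onto these labels is a simple threshold ladder. This page does not say how the package separates the types, so the `classify` function, its parameters, and the sample thresholds below are assumptions, not the package's logic.

```go
package main

import "fmt"

// Local copies of the clone type constants listed above.
const (
	CloneType1 = "Type-1"
	CloneType2 = "Type-2"
	CloneType3 = "Type-3"
)

// classify is a hypothetical helper: it returns a clone type label for a
// similarity in [0, 1], or "" when the score is below both thresholds.
func classify(similarity, type2Threshold, type3Threshold float64) string {
	switch {
	case similarity >= 1.0:
		return CloneType1
	case similarity >= type2Threshold:
		return CloneType2
	case similarity >= type3Threshold:
		return CloneType3
	default:
		return ""
	}
}

func main() {
	// 0.8 and 0.5 are made-up thresholds for the example.
	for _, s := range []float64{1.0, 0.9, 0.6, 0.2} {
		fmt.Printf("%.1f -> %q\n", s, classify(s, 0.8, 0.5))
	}
}
```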

const DefaultMaxClonePairs = 1000

DefaultMaxClonePairs is the default maximum number of clone pairs stored in the report detail. The total_clone_pairs count is not capped and remains exact. A value of zero means unlimited.
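The capping behavior described above can be sketched as follows. `capPairs` is a hypothetical helper, and pairs are represented as plain strings to keep the example short.

```go
package main

import "fmt"

// capPairs returns at most maxPairs stored entries together with the exact
// total. A maxPairs of zero means unlimited. Hypothetical helper.
func capPairs(pairs []string, maxPairs int) (stored []string, total int) {
	total = len(pairs)
	if maxPairs > 0 && total > maxPairs {
		return pairs[:maxPairs], total
	}
	return pairs, total
}

func main() {
	pairs := []string{"a~b", "a~c", "b~c"}

	stored, total := capPairs(pairs, 2)
	fmt.Println(len(stored), total) // prints 2 3

	stored, total = capPairs(pairs, 0)
	fmt.Println(len(stored), total) // prints 3 3
}
```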

Variables

This section is empty.

Functions

func RegisterPlotSections

func RegisterPlotSections()

RegisterPlotSections registers the clone detection plot section renderer.

Types

type Aggregator

type Aggregator struct {

	// MaxClonePairs limits the number of clone pairs stored in the report detail.
	// The total_clone_pairs count remains exact. Zero means unlimited.
	MaxClonePairs   int
	NumBands        int
	NumRows         int
	SimilarityType3 float64
	// contains filtered or unexported fields
}

Aggregator collects per-file function signatures and performs global cross-file clone detection in GetResult().

func NewAggregator

func NewAggregator() *Aggregator

NewAggregator creates a new clone detection aggregator.

func (*Aggregator) Aggregate

func (a *Aggregator) Aggregate(results map[string]analyze.Report)

Aggregate extracts function signatures from per-file reports. Signatures are qualified with the source file path so that same-named functions across files are distinguishable.
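A small sketch of why qualification matters: keyed by bare name, two functions called `Parse` in different files would overwrite each other. The `path:name` format and the `qualify` helper are assumptions; the page does not state the exact format the package uses.

```go
package main

import "fmt"

// qualify prefixes a function name with its source file path.
// Hypothetical helper; the separator is an assumption.
func qualify(filePath, funcName string) string {
	return filePath + ":" + funcName
}

func main() {
	signatures := map[string][]uint64{}

	// Same function name in two files, stored under distinct keys.
	signatures[qualify("pkg/a.go", "Parse")] = []uint64{1, 2, 3}
	signatures[qualify("pkg/b.go", "Parse")] = []uint64{1, 2, 4}

	fmt.Println(len(signatures)) // prints 2
}
```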

func (*Aggregator) GetResult

func (a *Aggregator) GetResult() analyze.Report

GetResult builds a global LSH index from all collected signatures, finds cross-file clone pairs, and computes metrics from global totals.
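The LSH step can be illustrated with a generic banding sketch: each signature is cut into bands, and two functions become a candidate pair when they agree on every value in at least one band. This is the textbook technique, not code from the package. `candidatePairs` and its string bucket keys are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// candidatePairs splits each signature into numBands bands of numRows values
// and returns every pair of names that shares at least one identical band.
// Hypothetical helper; results are sorted for deterministic output.
func candidatePairs(sigs map[string][]uint64, numBands, numRows int) [][2]string {
	names := make([]string, 0, len(sigs))
	for name := range sigs {
		names = append(names, name)
	}
	sort.Strings(names)

	seen := map[[2]string]bool{}
	var pairs [][2]string
	for b := 0; b < numBands; b++ {
		buckets := map[string][]string{}
		for _, name := range names {
			sig := sigs[name]
			if len(sig) < (b+1)*numRows {
				continue // signature too short for this band
			}
			key := fmt.Sprint(sig[b*numRows : (b+1)*numRows])
			buckets[key] = append(buckets[key], name)
		}
		for _, members := range buckets {
			for i := 0; i < len(members); i++ {
				for j := i + 1; j < len(members); j++ {
					p := [2]string{members[i], members[j]}
					if !seen[p] {
						seen[p] = true
						pairs = append(pairs, p)
					}
				}
			}
		}
	}
	sort.Slice(pairs, func(i, j int) bool {
		if pairs[i][0] != pairs[j][0] {
			return pairs[i][0] < pairs[j][0]
		}
		return pairs[i][1] < pairs[j][1]
	})
	return pairs
}

func main() {
	sigs := map[string][]uint64{
		"a.go:f": {1, 2, 3, 4},
		"b.go:g": {1, 2, 9, 9}, // shares band 0 with a.go:f
		"c.go:h": {7, 7, 7, 7},
	}
	fmt.Println(candidatePairs(sigs, 2, 2)) // prints [[a.go:f b.go:g]]
}
```

Candidates found this way would still need a similarity check against a threshold such as SimilarityType3 before being reported.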

type Analyzer

type Analyzer struct {
	// contains filtered or unexported fields
}

Analyzer provides clone detection analysis using MinHash and LSH.

func NewAnalyzer

func NewAnalyzer() *Analyzer

NewAnalyzer creates a new clone detection Analyzer.

func (*Analyzer) Analyze

func (a *Analyzer) Analyze(root *node.Node) (analyze.Report, error)

Analyze performs clone detection on the given UAST.

func (*Analyzer) Configure

func (a *Analyzer) Configure(facts map[string]any) error

Configure applies the settings in facts to the analyzer.

func (*Analyzer) CreateAggregator

func (a *Analyzer) CreateAggregator() analyze.ResultAggregator

CreateAggregator returns a new aggregator for clone analysis.

func (*Analyzer) CreateReportSection

func (a *Analyzer) CreateReportSection(report analyze.Report) analyze.ReportSection

CreateReportSection creates a ReportSection from report data.

func (*Analyzer) CreateVisitor

func (a *Analyzer) CreateVisitor() analyze.AnalysisVisitor

CreateVisitor creates a new visitor for single-pass traversal optimization.

func (*Analyzer) Descriptor

func (a *Analyzer) Descriptor() analyze.Descriptor

Descriptor returns stable analyzer metadata.

func (*Analyzer) Flag

func (a *Analyzer) Flag() string

Flag returns the CLI flag for the analyzer.

func (*Analyzer) FormatReport

func (a *Analyzer) FormatReport(report analyze.Report, w io.Writer) error

FormatReport formats clone analysis results as human-readable text.

func (*Analyzer) FormatReportBinary

func (a *Analyzer) FormatReportBinary(report analyze.Report, w io.Writer) error

FormatReportBinary formats clone analysis results as a binary envelope.

func (*Analyzer) FormatReportJSON

func (a *Analyzer) FormatReportJSON(report analyze.Report, w io.Writer) error

FormatReportJSON formats clone analysis results as JSON.

func (*Analyzer) FormatReportPlot

func (a *Analyzer) FormatReportPlot(report analyze.Report, w io.Writer) error

FormatReportPlot formats clone analysis results as an HTML plot.

func (*Analyzer) FormatReportYAML

func (a *Analyzer) FormatReportYAML(report analyze.Report, w io.Writer) error

FormatReportYAML formats clone analysis results as YAML.

func (*Analyzer) ListConfigurationOptions

func (a *Analyzer) ListConfigurationOptions() []pipeline.ConfigurationOption

ListConfigurationOptions returns the configuration options the analyzer accepts.

func (*Analyzer) Name

func (a *Analyzer) Name() string

Name returns the analyzer name.

func (*Analyzer) Thresholds

func (a *Analyzer) Thresholds() analyze.Thresholds

Thresholds returns the color-coded thresholds for clone metrics.

type ClonePair

type ClonePair struct {
	FuncA      string  `json:"func_a"     yaml:"func_a"`
	FuncB      string  `json:"func_b"     yaml:"func_b"`
	Similarity float64 `json:"similarity" yaml:"similarity"`
	CloneType  string  `json:"clone_type" yaml:"clone_type"`
}

ClonePair represents a detected clone relationship between two functions.
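The struct tags determine the field names in exported reports. The sketch below copies the struct locally and marshals one value with the standard `encoding/json` package. The `encodePair` helper and the sample values are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClonePair is a local copy of the struct shown above.
type ClonePair struct {
	FuncA      string  `json:"func_a"     yaml:"func_a"`
	FuncB      string  `json:"func_b"     yaml:"func_b"`
	Similarity float64 `json:"similarity" yaml:"similarity"`
	CloneType  string  `json:"clone_type" yaml:"clone_type"`
}

// encodePair marshals a pair to compact JSON. Hypothetical helper.
func encodePair(p ClonePair) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := encodePair(ClonePair{
		FuncA:      "a.go:Parse",
		FuncB:      "b.go:Parse",
		Similarity: 0.92,
		CloneType:  "Type-3",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
	// prints {"func_a":"a.go:Parse","func_b":"b.go:Parse","similarity":0.92,"clone_type":"Type-3"}
}
```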

type ComputedMetrics

type ComputedMetrics struct {
	TotalFunctions  int            `json:"total_functions"                   yaml:"total_functions"`
	TotalClonePairs int            `json:"total_clone_pairs"                 yaml:"total_clone_pairs"`
	CloneRatio      float64        `json:"clone_ratio"                       yaml:"clone_ratio"`
	CloneTypeDist   map[string]int `json:"clone_type_distribution,omitempty" yaml:"clone_type_distribution,omitempty"`
	ClonePairs      []ClonePair    `json:"clone_pairs"                       yaml:"clone_pairs"`
	Message         string         `json:"message"                           yaml:"message"`
}

ComputedMetrics holds computed clone detection metrics for JSON/YAML/binary export.

type PairKey

type PairKey struct {
	FuncA string
	FuncB string
}

PairKey is a canonical key for a clone pair to avoid duplicates.
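A canonical key is typically built by ordering the two names so that (A, B) and (B, A) map to the same value. The sketch below assumes lexicographic ordering. `canonicalKey` is a hypothetical constructor, not part of the documented API.

```go
package main

import "fmt"

// PairKey is a local copy of the struct shown above.
type PairKey struct {
	FuncA string
	FuncB string
}

// canonicalKey orders the two names lexicographically so the key does not
// depend on argument order. Hypothetical helper.
func canonicalKey(a, b string) PairKey {
	if b < a {
		a, b = b, a
	}
	return PairKey{FuncA: a, FuncB: b}
}

func main() {
	seen := map[PairKey]bool{}
	seen[canonicalKey("b.go:g", "a.go:f")] = true
	seen[canonicalKey("a.go:f", "b.go:g")] = true // same key, no duplicate

	fmt.Println(len(seen)) // prints 1
}
```

Because PairKey contains only comparable fields, it can be used directly as a map key.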

type ReportSection

type ReportSection struct {
	analyze.BaseReportSection
	// contains filtered or unexported fields
}

ReportSection implements the analyze.ReportSection interface for clone detection.

func NewReportSection

func NewReportSection(report analyze.Report) *ReportSection

NewReportSection creates a ReportSection from clone detection report data.

func (*ReportSection) AllIssues

func (s *ReportSection) AllIssues() []analyze.Issue

AllIssues returns all clone pairs as issues sorted by similarity descending.

func (*ReportSection) Distribution

func (s *ReportSection) Distribution() []analyze.DistributionItem

Distribution returns clone type distribution data. Uses the full-population distribution when available, falling back to the capped pairs array.
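The fallback described above can be sketched like this: prefer the full-population counts, and only recount from the stored pairs when those counts are missing. The `distribution` helper and the trimmed `ClonePair` copy are assumptions for illustration.

```go
package main

import "fmt"

// ClonePair is a trimmed local copy holding only the field this sketch needs.
type ClonePair struct {
	CloneType string
}

// distribution returns the full-population counts when present. Otherwise it
// recounts from the (possibly capped) pairs slice. Hypothetical helper.
func distribution(full map[string]int, pairs []ClonePair) map[string]int {
	if len(full) > 0 {
		return full
	}
	counts := map[string]int{}
	for _, p := range pairs {
		counts[p.CloneType]++
	}
	return counts
}

func main() {
	pairs := []ClonePair{{"Type-3"}, {"Type-2"}, {"Type-3"}}

	// No full distribution available: counts come from the stored pairs.
	fmt.Println(distribution(nil, pairs))

	// Full distribution available: it wins, even if pairs were capped.
	fmt.Println(distribution(map[string]int{"Type-3": 40, "Type-2": 12}, pairs))
}
```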

func (*ReportSection) KeyMetrics

func (s *ReportSection) KeyMetrics() []analyze.Metric

KeyMetrics returns ordered key metrics for display.

func (*ReportSection) TopIssues

func (s *ReportSection) TopIssues(n int) []analyze.Issue

TopIssues returns the top N clone pairs as issues.
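Selecting the top N pairs amounts to sorting by similarity in descending order and truncating. The sketch below works on a trimmed local `ClonePair` rather than `analyze.Issue`, whose fields are not shown on this page. `topPairs` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// ClonePair is a trimmed local copy of the struct documented above.
type ClonePair struct {
	FuncA      string
	FuncB      string
	Similarity float64
}

// topPairs returns up to n pairs with the highest similarity, without
// modifying the input slice. Hypothetical helper.
func topPairs(pairs []ClonePair, n int) []ClonePair {
	sorted := append([]ClonePair(nil), pairs...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Similarity > sorted[j].Similarity
	})
	if n < 0 {
		n = 0
	}
	if n > len(sorted) {
		n = len(sorted)
	}
	return sorted[:n]
}

func main() {
	pairs := []ClonePair{
		{"a.go:f", "b.go:g", 0.61},
		{"a.go:f", "c.go:h", 0.97},
		{"b.go:g", "c.go:h", 0.80},
	}
	for _, p := range topPairs(pairs, 2) {
		fmt.Printf("%s ~ %s (%.2f)\n", p.FuncA, p.FuncB, p.Similarity)
	}
}
```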

type Shingler

type Shingler struct {
	// contains filtered or unexported fields
}

Shingler extracts k-gram shingles from UAST function subtrees. A shingle is a sequence of k consecutive node types from a pre-order traversal.

func NewShingler

func NewShingler(k int) *Shingler

NewShingler creates a new Shingler with the given k-gram size.

func (*Shingler) ExtractShingles

func (s *Shingler) ExtractShingles(funcNode *node.Node) [][]byte

ExtractShingles returns k-gram shingles from a function's UAST subtree. Each shingle is a byte slice representing k consecutive node types joined by "|". Returns nil if the subtree has fewer than k nodes.
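The shingling rule above can be sketched without the real UAST types. The `Node` struct below is a stand-in with assumed `Type` and `Children` fields, and `shingles` is a hypothetical function that mirrors the documented behavior: pre-order traversal, k consecutive node types joined by "|", and nil when there are fewer than k nodes.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a stand-in for a UAST node. Its fields are assumptions.
type Node struct {
	Type     string
	Children []*Node
}

// preorder appends node types to out in pre-order (parent before children).
func preorder(n *Node, out []string) []string {
	if n == nil {
		return out
	}
	out = append(out, n.Type)
	for _, c := range n.Children {
		out = preorder(c, out)
	}
	return out
}

// shingles returns every window of k consecutive node types joined by "|".
// It returns nil when the subtree has fewer than k nodes.
func shingles(root *Node, k int) [][]byte {
	types := preorder(root, nil)
	if k <= 0 || len(types) < k {
		return nil
	}
	result := make([][]byte, 0, len(types)-k+1)
	for i := 0; i+k <= len(types); i++ {
		result = append(result, []byte(strings.Join(types[i:i+k], "|")))
	}
	return result
}

func main() {
	fn := &Node{Type: "Function", Children: []*Node{
		{Type: "Block", Children: []*Node{
			{Type: "Return", Children: []*Node{{Type: "Identifier"}}},
		}},
	}}
	for _, s := range shingles(fn, 3) {
		fmt.Println(string(s))
	}
	// prints:
	// Function|Block|Return
	// Block|Return|Identifier
}
```

Because only node types enter the shingles, two functions that differ only in identifier names produce the same shingle set.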

type Visitor

type Visitor struct {
	// contains filtered or unexported fields
}

Visitor implements the AnalysisVisitor interface for clone detection. It collects function nodes during traversal and exports MinHash signatures for cross-file clone detection by the aggregator.

func NewVisitor

func NewVisitor() *Visitor

NewVisitor creates a new clone detection Visitor.

func (*Visitor) GetReport

func (v *Visitor) GetReport() analyze.Report

GetReport returns the clone detection report with function signatures. Detection is deferred to the aggregator for cross-file comparison.

func (*Visitor) OnEnter

func (v *Visitor) OnEnter(n *node.Node, _ int)

OnEnter is called when entering a node during traversal.

func (*Visitor) OnExit

func (v *Visitor) OnExit(_ *node.Node, _ int)

OnExit is called when exiting a node during traversal.
