shotness

Published: Mar 12, 2026 License: Apache-2.0 Imports: 27 Imported by: 0

README

Shotness (Structural Hotness) Analysis

Preface

Knowing that a file changed is good. Knowing what part of the file changed is better.

Problem

File-level statistics are too coarse. A "utils.go" file might be huge and change constantly, but are those changes in the same function or scattered everywhere? We need fine-grained resolution.

How the analyzer solves it

"Shotness" (Structural Hotness) tracks changes to specific structural elements, such as functions or classes, selected by a user-defined DSL query. It tells you which functions are modified most frequently and which ones change together.

Historical context

This is an evolution of "hotspot" analysis (Adam Tornhill, Your Code as a Crime Scene), moving from file-granularity to logical-unit-granularity. The concept originated in Hercules (src-d/hercules) and has been refined here with normalized coupling strength metrics.

Real world examples

  • Testing Strategy: If ProcessPayment() changes in 50% of commits, it needs extremely robust tests.
  • Volatility Analysis: Identifying unstable functions that might need refactoring to adhere to the Open/Closed Principle.
  • Team Assessment: Functions with high coupling strength (> 0.8) are candidates for extraction into shared modules.
  • Risk Prioritization: HIGH risk nodes (≥ 20 changes) should be reviewed for design flaws, not just bugs.

How the analyzer works here

  1. Configuration: User defines a DSL query (e.g., filter(.roles has "Function")) to select nodes of interest.
  2. Node Tracking: As files change, the analyzer tracks these specific named nodes via diff hunk mapping.
  3. Renames: It handles function renames (if supported by UAST diffing) to maintain history.
  4. Co-occurrence: It also tracks which functions change together (Structural Coupling).
  5. Normalization: Coupling strength is normalized to [0, 1] using the formula: co_changes / max(co_changes, changes_a, changes_b).
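The normalization in step 5 can be sketched as a small helper. This is an illustrative standalone function (the name `couplingStrength` is not part of the package API), applying the formula co_changes / max(co_changes, changes_a, changes_b):

```go
package main

import "fmt"

// couplingStrength normalizes a co-change count to [0, 1]:
// co_changes / max(co_changes, changes_a, changes_b).
// It is 1.0 only when two nodes always change together.
func couplingStrength(coChanges, changesA, changesB int) float64 {
	maxVal := coChanges
	if changesA > maxVal {
		maxVal = changesA
	}
	if changesB > maxVal {
		maxVal = changesB
	}
	if maxVal == 0 {
		return 0
	}
	return float64(coChanges) / float64(maxVal)
}

func main() {
	// Two functions co-changed 8 times; individually changed 10 and 8 times.
	fmt.Println(couplingStrength(8, 10, 8)) // 0.8
}
```

Because the denominator is the larger of the two individual change counts, the strength is a conservative confidence value: a pair only scores high when co-changes dominate both nodes' histories.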

Output Formats

  • JSON/YAML: Structured metrics with node_hotness, node_coupling, hotspot_nodes, and aggregate sections.
  • Text: Terminal-friendly output with colored progress bars, risk classification, and coupling arrows.
  • Plot: Interactive HTML dashboard with TreeMap, HeatMap, and Bar Chart visualizations.

Metrics

  • Hotness Score: Normalized [0, 1] relative to the most changed function.
  • Coupling Strength: Normalized [0, 1] confidence metric for co-change pairs.
  • Risk Level: HIGH (≥ 20), MEDIUM (≥ 10), LOW (< 10) change count thresholds.
  • Aggregate: Summary statistics including average coupling strength across all pairs.
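The first three metrics can be sketched in a few lines, assuming the documented thresholds (HIGH ≥ 20, MEDIUM ≥ 10); the helper names are illustrative, not package API:

```go
package main

import "fmt"

// Thresholds mirror HotspotThresholdHigh and HotspotThresholdMedium.
const (
	thresholdHigh   = 20
	thresholdMedium = 10
)

// riskLevel classifies a node by its raw change count.
func riskLevel(changes int) string {
	switch {
	case changes >= thresholdHigh:
		return "HIGH"
	case changes >= thresholdMedium:
		return "MEDIUM"
	default:
		return "LOW"
	}
}

// hotnessScore normalizes a node's change count against the most
// changed node in the repository, yielding a value in [0, 1].
func hotnessScore(changes, maxChanges int) float64 {
	if maxChanges == 0 {
		return 0
	}
	return float64(changes) / float64(maxChanges)
}

func main() {
	fmt.Println(riskLevel(25), riskLevel(12), riskLevel(3)) // HIGH MEDIUM LOW
	fmt.Println(hotnessScore(25, 50))                       // 0.5
}
```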

Limitations

  • Performance: Fine-grained UAST diffing is more expensive than file-level diffing.
  • DSL Complexity: Requires understanding the UAST structure to write effective queries.
  • Large Functions: Any change within a function's line range counts as a change to that function, so very large functions tend to accumulate inflated change counts.

Further plans

  • Pre-defined queries for common languages.
  • Temporal decay: weight recent changes higher than old ones.
  • Cross-repository coupling analysis.

Documentation

Overview

Package shotness provides structural hotness ("shotness") analysis: it measures how often individual code entities, such as functions, change across commit history and how strongly they co-change.

Index

Constants

const (
	// ConfigShotnessDSLStruct is the configuration key for the DSL structure expression.
	ConfigShotnessDSLStruct = "Shotness.DSLStruct"
	// ConfigShotnessDSLName is the configuration key for the DSL name expression.
	ConfigShotnessDSLName = "Shotness.DSLName"
	// DefaultShotnessDSLStruct is the default DSL expression for selecting code structures.
	DefaultShotnessDSLStruct = "filter(.roles has \"Function\")"
	// DefaultShotnessDSLName is the default DSL expression for extracting names.
	DefaultShotnessDSLName = ".props.name"
)
const (
	HotspotThresholdHigh   = 20
	HotspotThresholdMedium = 10
)

Hotspot thresholds.

const (
	RiskLevelHigh   = "HIGH"
	RiskLevelMedium = "MEDIUM"
	RiskLevelLow    = "LOW"
)

Risk level constants.

const (
	KindNodeData  = "node_data"
	KindAggregate = "aggregate"
)

Store record kind constants.

Variables

var ErrInvalidCounters = errors.New("invalid shotness report: expected []map[int]int for Counters")

ErrInvalidCounters indicates the report doesn't contain expected counters data.

var ErrInvalidNodes = errors.New("invalid shotness report: expected []NodeSummary for Nodes")

ErrInvalidNodes indicates the report doesn't contain expected nodes data.

Functions

func GenerateStoreSections

func GenerateStoreSections(reader analyze.ReportReader) ([]plotpage.Section, error)

GenerateStoreSections reads pre-computed shotness data from a ReportReader and builds the same plot sections as GenerateSections, without materializing a full Report or recomputing the co-change matrix.

func RegisterPlotSections

func RegisterPlotSections()

RegisterPlotSections registers the shotness plot section renderer with the analyze package.

Types

type AggregateData

type AggregateData struct {
	TotalNodes          int     `json:"total_nodes"           yaml:"total_nodes"`
	TotalChanges        int     `json:"total_changes"         yaml:"total_changes"`
	TotalCouplings      int     `json:"total_couplings"       yaml:"total_couplings"`
	AvgChangesPerNode   float64 `json:"avg_changes_per_node"  yaml:"avg_changes_per_node"`
	AvgCouplingStrength float64 `json:"avg_coupling_strength" yaml:"avg_coupling_strength"`
	HotNodes            int     `json:"hot_nodes"             yaml:"hot_nodes"`
}

AggregateData contains summary statistics.

type Analyzer

type Analyzer struct {
	*analyze.BaseHistoryAnalyzer[*ComputedMetrics]

	FileDiff *plumbing.FileDiffAnalyzer
	UAST     *plumbing.UASTChangesAnalyzer

	DSLStruct string
	DSLName   string
	// contains filtered or unexported fields
}

Analyzer measures co-change frequency of code entities across commit history.

func NewAnalyzer

func NewAnalyzer() *Analyzer

NewAnalyzer creates a new shotness analyzer.

func (*Analyzer) ApplySnapshot

func (s *Analyzer) ApplySnapshot(snap analyze.PlumbingSnapshot)

ApplySnapshot restores plumbing state from a previously captured snapshot.

func (*Analyzer) Boot

func (s *Analyzer) Boot() error

Boot restores the analyzer from hibernated state. Re-initializes the merge tracker for the next chunk.

func (*Analyzer) Configure

func (s *Analyzer) Configure(facts map[string]any) error

Configure sets up the analyzer with the provided facts.

func (*Analyzer) Consume

func (s *Analyzer) Consume(ctx context.Context, ac *analyze.Context) (analyze.TC, error)

Consume processes a single commit with the provided dependency results.

func (*Analyzer) DiscardState

func (s *Analyzer) DiscardState()

DiscardState clears cumulative node coupling state. In streaming timeseries mode, per-commit data is already captured in the TC; the accumulated nodes map (which grows O(N²) with coupling pairs) is only needed for the final report and can be discarded between chunks.

func (*Analyzer) ExtractCommitTimeSeries

func (s *Analyzer) ExtractCommitTimeSeries(report analyze.Report) map[string]any

ExtractCommitTimeSeries implements analyze.CommitTimeSeriesProvider. It extracts per-commit structural hotspot data for the unified timeseries output.

func (*Analyzer) Fork

func (s *Analyzer) Fork(n int) []analyze.HistoryAnalyzer

Fork creates a copy of the analyzer for parallel processing. Each fork gets independent mutable state while sharing read-only dependencies.

func (*Analyzer) GenerateChart

func (s *Analyzer) GenerateChart(report analyze.Report) (components.Charter, error)

GenerateChart creates a bar chart showing the hottest functions.

func (*Analyzer) GenerateSections

func (s *Analyzer) GenerateSections(report analyze.Report) ([]plotpage.Section, error)

GenerateSections returns the sections for combined reports.

func (*Analyzer) Hibernate

func (s *Analyzer) Hibernate() error

Hibernate compresses the analyzer's state to reduce memory usage. Resets the merge tracker since processed commits won't be seen again during streaming (commits are processed chronologically).

func (*Analyzer) Initialize

func (s *Analyzer) Initialize(_ *gitlib.Repository) error

Initialize prepares the analyzer for processing commits.

func (*Analyzer) Merge

func (s *Analyzer) Merge(branches []analyze.HistoryAnalyzer)

Merge combines results from forked analyzer branches.

func (*Analyzer) Name

func (s *Analyzer) Name() string

Name returns the analyzer name.

func (*Analyzer) NeedsUAST

func (s *Analyzer) NeedsUAST() bool

NeedsUAST returns true to enable the UAST pipeline.

func (*Analyzer) ReleaseSnapshot

func (s *Analyzer) ReleaseSnapshot(snap analyze.PlumbingSnapshot)

ReleaseSnapshot releases UAST trees owned by the snapshot.

func (*Analyzer) SnapshotPlumbing

func (s *Analyzer) SnapshotPlumbing() analyze.PlumbingSnapshot

SnapshotPlumbing captures the current plumbing output state for parallel execution.

func (*Analyzer) WriteToStore

func (s *Analyzer) WriteToStore(ctx context.Context, ticks []analyze.TICK, w analyze.ReportWriter) error

WriteToStore implements analyze.StoreWriter. It merges per-tick node data, builds the co-change matrix, and streams pre-computed results as individual records:

  • "node_data": per-node NodeStoreRecord records (sorted by node key).
  • "aggregate": single AggregateData record.

type CommitData

type CommitData struct {
	// NodesTouched maps node key to its delta for this commit.
	NodesTouched map[string]NodeDelta
}

CommitData is the per-commit TC payload emitted by Consume(). It captures per-commit node touch deltas; coupling pairs are derived inline by the aggregator from NodesTouched keys to avoid O(N²) allocation.
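The inline pair derivation described above can be sketched as follows. This is a simplified standalone version (the aggregator itself is unexported); it accumulates co-change counts for every unordered pair of node keys touched in one commit, without building a per-commit pair list:

```go
package main

import (
	"fmt"
	"sort"
)

// accumulatePairs increments a shared co-change counter for every
// unordered pair of node keys touched in a single commit. Keys are
// sorted first so that (a, b) and (b, a) map to the same entry.
func accumulatePairs(touched []string, coChanges map[[2]string]int) {
	sort.Strings(touched)
	for i := 0; i < len(touched); i++ {
		for j := i + 1; j < len(touched); j++ {
			coChanges[[2]string{touched[i], touched[j]}]++
		}
	}
}

func main() {
	coChanges := map[[2]string]int{}
	// Commit 1 touches two functions; commit 2 touches three.
	accumulatePairs([]string{"ProcessPayment", "ValidateCard"}, coChanges)
	accumulatePairs([]string{"ValidateCard", "ProcessPayment", "Refund"}, coChanges)
	fmt.Println(coChanges[[2]string{"ProcessPayment", "ValidateCard"}]) // 2
}
```

The pair loop is still quadratic in the number of nodes touched per commit, but commits typically touch few nodes, so deriving pairs at aggregation time avoids storing an O(N²) pair list in every CommitData payload.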

type CommitSummary

type CommitSummary struct {
	NodesTouched  int `json:"nodes_touched"`
	CouplingPairs int `json:"coupling_pairs"`
}

CommitSummary holds per-commit summary data for timeseries output.

type ComputedMetrics

type ComputedMetrics struct {
	NodeHotness  []NodeHotnessData  `json:"node_hotness"  yaml:"node_hotness"`
	NodeCoupling []NodeCouplingData `json:"node_coupling" yaml:"node_coupling"`
	HotspotNodes []HotspotNodeData  `json:"hotspot_nodes" yaml:"hotspot_nodes"`
	Aggregate    AggregateData      `json:"aggregate"     yaml:"aggregate"`
}

ComputedMetrics holds all computed metric results for the shotness analyzer.

func ComputeAllMetrics

func ComputeAllMetrics(report analyze.Report) (*ComputedMetrics, error)

ComputeAllMetrics runs all shotness metrics and returns the results.

func (*ComputedMetrics) AnalyzerName

func (m *ComputedMetrics) AnalyzerName() string

AnalyzerName returns the analyzer identifier.

func (*ComputedMetrics) ToJSON

func (m *ComputedMetrics) ToJSON() any

ToJSON returns the metrics in JSON-serializable format.

func (*ComputedMetrics) ToYAML

func (m *ComputedMetrics) ToYAML() any

ToYAML returns the metrics in YAML-serializable format.

type HotspotNodeData

type HotspotNodeData struct {
	Name        string `json:"name"         yaml:"name"`
	Type        string `json:"type"         yaml:"type"`
	File        string `json:"file"         yaml:"file"`
	ChangeCount int    `json:"change_count" yaml:"change_count"`
	RiskLevel   string `json:"risk_level"   yaml:"risk_level"`
}

HotspotNodeData identifies hot nodes that change frequently.

type NodeCouplingData

type NodeCouplingData struct {
	Node1Name string  `json:"node1_name"        yaml:"node1_name"`
	Node1File string  `json:"node1_file"        yaml:"node1_file"`
	Node2Name string  `json:"node2_name"        yaml:"node2_name"`
	Node2File string  `json:"node2_file"        yaml:"node2_file"`
	CoChanges int     `json:"co_changes"        yaml:"co_changes"`
	Strength  float64 `json:"coupling_strength" yaml:"coupling_strength"`
}

NodeCouplingData contains coupling between code nodes.

type NodeDelta

type NodeDelta struct {
	// Summary identifies the node (type, name, file).
	Summary NodeSummary
	// CountDelta is the change count increment (1 for first touch in a commit, 0 otherwise).
	CountDelta int
}

NodeDelta represents a single node's contribution in one commit.

type NodeHotnessData

type NodeHotnessData struct {
	Name         string  `json:"name"          yaml:"name"`
	Type         string  `json:"type"          yaml:"type"`
	File         string  `json:"file"          yaml:"file"`
	ChangeCount  int     `json:"change_count"  yaml:"change_count"`
	CoupledNodes int     `json:"coupled_nodes" yaml:"coupled_nodes"`
	HotnessScore float64 `json:"hotness_score" yaml:"hotness_score"`
}

NodeHotnessData contains hotness information for a code node.

type NodeStoreRecord

type NodeStoreRecord struct {
	Summary NodeSummary
	Counter map[int]int
}

NodeStoreRecord holds a single node's summary and co-change counter map. Counter keys are node indices into the ordered node list; counter values are co-change counts; counter[self] is the node's self-change count.
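Assuming the Counter layout described above (counter[self] is the node's own change count, other keys are co-change counts), the coupling strength of a node pair can be recovered like this. The struct is re-declared locally for illustration and is not the package type:

```go
package main

import "fmt"

// record mirrors NodeStoreRecord's counter layout: keys are node
// indices into the ordered node list, values are co-change counts,
// and counter[self] is the node's own change count.
type record struct {
	counter map[int]int
}

// pairStrength computes co_changes / max(co_changes, changes_a, changes_b)
// for nodes a and b from their counter maps.
func pairStrength(records []record, a, b int) float64 {
	co := records[a].counter[b]
	maxVal := co
	if v := records[a].counter[a]; v > maxVal {
		maxVal = v
	}
	if v := records[b].counter[b]; v > maxVal {
		maxVal = v
	}
	if maxVal == 0 {
		return 0
	}
	return float64(co) / float64(maxVal)
}

func main() {
	records := []record{
		{counter: map[int]int{0: 10, 1: 4}}, // node 0: 10 changes, 4 shared with node 1
		{counter: map[int]int{1: 5, 0: 4}},  // node 1: 5 changes
	}
	fmt.Println(pairStrength(records, 0, 1)) // 0.4
}
```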

type NodeSummary

type NodeSummary struct {
	Type string
	Name string
	File string
}

NodeSummary holds identifying information for a code node.

func (*NodeSummary) String

func (ns *NodeSummary) String() string

type ReportData

type ReportData struct {
	Nodes    []NodeSummary
	Counters []map[int]int
}

ReportData is the parsed input data for shotness metrics computation.

func ParseReportData

func ParseReportData(report analyze.Report) (*ReportData, error)

ParseReportData extracts ReportData from an analyzer report.

type TickData

type TickData struct {
	// Nodes maps node key to accumulated node data.
	Nodes map[string]*nodeShotnessData
	// CommitStats holds per-commit summary data for timeseries output.
	CommitStats map[string]*CommitSummary
}

TickData is the per-tick aggregated payload stored in analyze.TICK.Data.
