wcanalyze

package
v1.0.0-beta.4
Warning

This package is not in the latest version of its module.

Published: Jun 22, 2026 License: Apache-2.0 Imports: 9 Imported by: 0

Documentation

Overview

Package wcanalyze reconstructs an operation graph from a wcprof dump and runs offline wall-clock bottleneck analysis over it: self-time accounting, a replay-based counterfactual simulator, and per-class what-if rankings.


Constants

This section is empty.

Variables

This section is empty.

Functions

func ActualMakespanNS

func ActualMakespanNS(g *Graph) int64

ActualMakespanNS returns the observed makespan over root ops.

func DeadAir

func DeadAir(g *Graph, thresholdNS int64) []segment

DeadAir returns the gaps inside the trace, each longer than thresholdNS, during which no recorded op was running.
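The gap search can be illustrated with a sweep over intervals sorted by start time. This is a minimal sketch, not the package's implementation: the `interval` type is a hypothetical stand-in for the unexported `segment` type, and `deadAir` takes plain intervals and explicit trace bounds instead of a `*Graph`.

```go
package main

import (
	"fmt"
	"sort"
)

// interval is a hypothetical stand-in for the package's unexported segment type.
type interval struct{ StartNS, EndNS int64 }

// deadAir returns the gaps longer than thresholdNS between traceStart and
// traceEnd that are not covered by any interval in ops.
func deadAir(ops []interval, traceStart, traceEnd, thresholdNS int64) []interval {
	sorted := append([]interval(nil), ops...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].StartNS < sorted[j].StartNS })

	var gaps []interval
	cursor := traceStart // latest end time covered so far
	for _, op := range sorted {
		if op.StartNS-cursor > thresholdNS {
			gaps = append(gaps, interval{cursor, op.StartNS})
		}
		if op.EndNS > cursor {
			cursor = op.EndNS
		}
	}
	if traceEnd-cursor > thresholdNS {
		gaps = append(gaps, interval{cursor, traceEnd})
	}
	return gaps
}

func main() {
	ops := []interval{{0, 100}, {50, 120}, {300, 400}, {405, 500}}
	fmt.Println(deadAir(ops, 0, 500, 10)) // prints [{120 300}]
}
```

The 5 ns gap between 400 and 405 is below the threshold and is therefore not reported.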

func WriteReport

func WriteReport(w io.Writer, g *Graph, opts ReportOptions) error

WriteReport renders the full human-readable analysis.

Types

type ClassKey

type ClassKey struct {
	Kind  string
	Class string
}

ClassKey identifies an operation class for aggregation: ops are grouped by kind+class (e.g. call_exec / "Container.withExec").

func (ClassKey) String

func (k ClassKey) String() string

type ClassStats

type ClassStats struct {
	Key      ClassKey
	WorkType string
	Count    int
	Outcomes map[string]int

	TotalWallNS int64
	TotalSelfNS int64
	MaxSelfNS   int64
	P50SelfNS   int64
	P95SelfNS   int64

	// DupExecuted counts ops beyond the first execution per ident (same
	// operation executed more than once).
	DupExecuted int
}

ClassStats aggregates ops of one class.

func AggregateClasses

func AggregateClasses(g *Graph) []*ClassStats

AggregateClasses computes per-class statistics over all ops.
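As an illustration of per-class aggregation, the sketch below groups self-time samples by a kind+class key and derives count, total, maximum, and percentiles. It rests on assumptions: the `classKey`, `sample`, and `classStats` types are hypothetical local stand-ins, the second class name in the example is made up, and the nearest-rank percentile rule is one common choice that the package may not share.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// classKey, sample, and classStats are hypothetical stand-ins for
// ClassKey, Op, and ClassStats.
type classKey struct{ Kind, Class string }

type sample struct {
	Key    classKey
	SelfNS int64
}

type classStats struct {
	Count       int
	TotalSelfNS int64
	MaxSelfNS   int64
	P50SelfNS   int64
	P95SelfNS   int64
}

// percentile uses the nearest-rank rule on an ascending slice
// (an assumption; the package's percentile method is not documented).
func percentile(sorted []int64, p float64) int64 {
	if len(sorted) == 0 {
		return 0
	}
	rank := int(math.Ceil(p * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// aggregate groups samples by class key and summarizes their self-times.
func aggregate(samples []sample) map[classKey]classStats {
	byClass := make(map[classKey][]int64)
	for _, s := range samples {
		byClass[s.Key] = append(byClass[s.Key], s.SelfNS)
	}
	out := make(map[classKey]classStats, len(byClass))
	for k, selfs := range byClass {
		sort.Slice(selfs, func(i, j int) bool { return selfs[i] < selfs[j] })
		st := classStats{Count: len(selfs), MaxSelfNS: selfs[len(selfs)-1]}
		for _, v := range selfs {
			st.TotalSelfNS += v
		}
		st.P50SelfNS = percentile(selfs, 0.50)
		st.P95SelfNS = percentile(selfs, 0.95)
		out[k] = st
	}
	return out
}

func main() {
	exec := classKey{"call_exec", "Container.withExec"}
	other := classKey{"call", "Query.container"} // made-up class for illustration
	samples := []sample{{exec, 10}, {exec, 40}, {exec, 20}, {exec, 30}, {other, 5}}
	fmt.Printf("%+v\n", aggregate(samples)[exec])
	// prints {Count:4 TotalSelfNS:100 MaxSelfNS:40 P50SelfNS:20 P95SelfNS:40}
}
```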

type DriftOrigin

type DriftOrigin struct {
	Op         *Op
	DurDriftNS int64
	// OwnDriftNS is DurDriftNS minus the largest drift among children and
	// wait targets.
	OwnDriftNS int64
}

DriftOrigin is an op whose baseline-simulated duration inflated beyond its recorded duration by more than its dependencies' inflation explains: the place where replay-model error is introduced (rather than inherited).

func DriftOrigins

func DriftOrigins(g *Graph, minOwnDriftNS int64, topN int) []DriftOrigin

DriftOrigins finds where baseline replay error originates, comparing each op's simulated finish lateness (vs its recorded end) against the worst lateness among its dependencies.
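The "own drift" idea can be sketched as follows: subtract the worst drift among an op's dependencies from the op's own drift, then keep the ops where the remainder is large. This is an illustration under assumptions, not the package's code. The `node` type is hypothetical, the threshold is treated as inclusive, negative dependency drift is ignored, and ties keep input order.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical stand-in for an op with a precomputed duration drift.
type node struct {
	Name       string
	DurDriftNS int64
	Deps       []*node // children and wait targets
}

// ownDrift is the node's drift minus the largest drift among its dependencies.
func ownDrift(n *node) int64 {
	var worst int64
	for _, d := range n.Deps {
		if d.DurDriftNS > worst {
			worst = d.DurDriftNS
		}
	}
	return n.DurDriftNS - worst
}

// driftOrigins keeps nodes whose own drift is at least minOwnDriftNS,
// ordered by own drift (largest first) and truncated to topN.
func driftOrigins(nodes []*node, minOwnDriftNS int64, topN int) []*node {
	var out []*node
	for _, n := range nodes {
		if ownDrift(n) >= minOwnDriftNS {
			out = append(out, n)
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return ownDrift(out[i]) > ownDrift(out[j]) })
	if topN >= 0 && topN < len(out) {
		out = out[:topN]
	}
	return out
}

func main() {
	leaf := &node{Name: "leaf", DurDriftNS: 30}
	mid := &node{Name: "mid", DurDriftNS: 35, Deps: []*node{leaf}}
	root := &node{Name: "root", DurDriftNS: 36, Deps: []*node{mid}}
	other := &node{Name: "other", DurDriftNS: 5}
	for _, n := range driftOrigins([]*node{root, mid, leaf, other}, 5, 2) {
		fmt.Println(n.Name, ownDrift(n))
	}
	// prints:
	// leaf 30
	// mid 5
}
```

Here `root` has the largest total drift (36) but inherits almost all of it from `mid`, which in turn inherits most of its drift from `leaf`; only `leaf` introduces a large error of its own.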

type Graph

type Graph struct {
	Ops   map[uint64]*Op
	Roots []*Op // ops with no (resolved) parent, sorted by StartNS

	// OrphanWaits are waits whose waiter op is unknown.
	OrphanWaits []*WaitEdge

	DroppedEvents uint64
	OpenOps       int

	// TraceStartNS/TraceEndNS bound all recorded activity.
	TraceStartNS int64
	TraceEndNS   int64
	// contains filtered or unexported fields
}

Graph is the reconstructed op graph for one dump.

func Build

func Build(header *wcprof.DumpHeader, events []wcprof.DumpEvent) (*Graph, error)

Build reconstructs the op graph from parsed dump data.

func Load

func Load(r io.Reader) (*Graph, error)

Load reads a wcprof dump and reconstructs the op graph.

func LoadMulti

func LoadMulti(readers []io.Reader) (*Graph, error)

LoadMulti reads multiple dumps taken from the same recorder (periodic drains of one engine run) and reconstructs one combined op graph.

This relies on recorder guarantees across flushes: the string table only grows (IDs are stable), op IDs are globally unique, the epoch is fixed, and the dropped-event counter is cumulative. The merge keeps the header with the longest string table and latest open-ops view, and concatenates all events.

type Op

type Op struct {
	ID       uint64
	ParentID uint64
	Kind     string
	WorkType string
	Outcome  string
	Class    string
	Ident    string
	ClientID string
	ResultID uint64
	StartNS  int64
	EndNS    int64
	// Open marks ops that had not ended at dump time; EndNS is the dump time.
	Open bool

	Parent   *Op
	Children []*Op // sorted by StartNS
	Waits    []*WaitEdge
	// Reparented marks ops whose parent was assigned via a nested-client
	// link rather than a recorded parent ID.
	Reparented bool
	// contains filtered or unexported fields
}

Op is one reconstructed operation interval.

func BlockingChain

func BlockingChain(g *Graph, maxDepth int) []*Op

BlockingChain walks back from the op that finishes last in the baseline simulation, at each step following the child or wait whose interval ends latest, yielding an approximate end-of-workload critical chain.
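The walk can be sketched as a greedy descent: at each step, move to the dependency whose interval ends latest. The sketch below is illustrative only. The `op` type is a hypothetical stand-in, children and wait targets are merged into a single `Deps` slice, and the starting op is passed in rather than being selected from a baseline simulation.

```go
package main

import "fmt"

// op is a hypothetical stand-in for an op with a recorded end time.
type op struct {
	Name  string
	EndNS int64
	Deps  []*op // children and wait targets
}

// blockingChain starts at last and repeatedly follows the dependency that
// ends latest, stopping after maxDepth ops or when an op has no dependencies.
func blockingChain(last *op, maxDepth int) []*op {
	var chain []*op
	for cur := last; cur != nil && len(chain) < maxDepth; {
		chain = append(chain, cur)
		var next *op
		for _, d := range cur.Deps {
			if next == nil || d.EndNS > next.EndNS {
				next = d
			}
		}
		cur = next
	}
	return chain
}

func main() {
	c := &op{Name: "c", EndNS: 90}
	d := &op{Name: "d", EndNS: 40}
	a := &op{Name: "a", EndNS: 60}
	b := &op{Name: "b", EndNS: 95, Deps: []*op{c, d}}
	root := &op{Name: "root", EndNS: 100, Deps: []*op{a, b}}
	for _, o := range blockingChain(root, 10) {
		fmt.Print(o.Name, " ")
	}
	fmt.Println() // prints "root b c " followed by a newline
}
```

The `maxDepth` bound also guarantees termination if the dependency data contains a cycle.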

func (*Op) Duration

func (op *Op) Duration() int64

func (*Op) Key

func (op *Op) Key() ClassKey

Key returns the op's aggregation class key.

func (*Op) SelfNS

func (op *Op) SelfNS() int64

SelfNS is the total self time of the op.

func (*Op) SelfSegments

func (op *Op) SelfSegments() []segment

SelfSegments returns the op's interval minus its children's intervals and its own wait intervals: the time the op was plausibly doing its own work.
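The subtraction described above amounts to removing a set of possibly overlapping busy intervals from one enclosing interval. A minimal sketch, assuming a hypothetical `interval` type in place of the unexported `segment` and a flat list of busy intervals (children and waits combined):

```go
package main

import (
	"fmt"
	"sort"
)

// interval is a hypothetical stand-in for the package's unexported segment type.
type interval struct{ StartNS, EndNS int64 }

// clamp limits v to the range [lo, hi].
func clamp(v, lo, hi int64) int64 {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// selfSegments returns the parts of op not covered by any busy interval.
func selfSegments(op interval, busy []interval) []interval {
	sorted := append([]interval(nil), busy...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].StartNS < sorted[j].StartNS })

	var out []interval
	cursor := op.StartNS // everything before cursor is already accounted for
	for _, b := range sorted {
		start := clamp(b.StartNS, op.StartNS, op.EndNS)
		end := clamp(b.EndNS, op.StartNS, op.EndNS)
		if start > cursor {
			out = append(out, interval{cursor, start})
		}
		if end > cursor {
			cursor = end
		}
	}
	if cursor < op.EndNS {
		out = append(out, interval{cursor, op.EndNS})
	}
	return out
}

// selfNS sums the lengths of the self segments.
func selfNS(segs []interval) int64 {
	var total int64
	for _, s := range segs {
		total += s.EndNS - s.StartNS
	}
	return total
}

func main() {
	op := interval{0, 100}
	busy := []interval{{10, 30}, {20, 50}, {70, 80}} // two overlapping children and one wait
	segs := selfSegments(op, busy)
	fmt.Println(segs, selfNS(segs)) // prints [{0 10} {50 70} {80 100}] 50
}
```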

type OpDrift

type OpDrift struct {
	Op           *Op
	SimStartNS   int64
	SimFinishNS  int64
	StartDriftNS int64 // simStart - origStart
	DurDriftNS   int64 // (simFinish-simStart) - origDuration
}

OpDrift describes how far an op's simulated schedule diverged from its recorded one in a baseline replay (factor 1 everywhere). Large positive drift indicates the replay model over-constrains that op.

func BaselineDrift

func BaselineDrift(g *Graph, topN int) (durDrift, startDrift []OpDrift)

BaselineDrift replays at factor 1 and returns the ops whose simulated duration grew the most versus their recorded duration, plus the ops whose simulated start moved latest. Used to debug replay-model fidelity.
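The two drift quantities follow directly from the field comments on OpDrift. The sketch below computes them from recorded and simulated times and ranks ops by duration drift; `opTimes`, `drift`, and `topByDurDrift` are hypothetical names, and the simulated times are supplied by hand rather than produced by a replay.

```go
package main

import (
	"fmt"
	"sort"
)

// opTimes holds recorded and simulated times for one op (hypothetical type).
type opTimes struct {
	Name                    string
	StartNS, EndNS          int64 // recorded
	SimStartNS, SimFinishNS int64 // from a baseline replay
}

// drift mirrors the two drift fields of OpDrift.
type drift struct {
	Name         string
	StartDriftNS int64 // simStart - origStart
	DurDriftNS   int64 // (simFinish-simStart) - origDuration
}

func computeDrift(o opTimes) drift {
	return drift{
		Name:         o.Name,
		StartDriftNS: o.SimStartNS - o.StartNS,
		DurDriftNS:   (o.SimFinishNS - o.SimStartNS) - (o.EndNS - o.StartNS),
	}
}

// topByDurDrift returns the topN ops with the largest duration drift.
func topByDurDrift(ops []opTimes, topN int) []drift {
	ds := make([]drift, 0, len(ops))
	for _, o := range ops {
		ds = append(ds, computeDrift(o))
	}
	sort.SliceStable(ds, func(i, j int) bool { return ds[i].DurDriftNS > ds[j].DurDriftNS })
	if topN >= 0 && topN < len(ds) {
		ds = ds[:topN]
	}
	return ds
}

func main() {
	ops := []opTimes{
		{"A", 0, 100, 0, 100},
		{"B", 100, 200, 120, 260},
		{"C", 50, 80, 50, 90},
	}
	fmt.Println(topByDurDrift(ops, 2)) // prints [{B 20 40} {C 0 10}]
}
```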

type ReportOptions

type ReportOptions struct {
	TopClasses     int
	WhatIfFactors  []float64
	MinClassSelfNS int64
	DeadAirMinNS   int64
	ChainDepth     int
}

ReportOptions controls WriteReport.

type Simulation

type Simulation struct {

	// Factors scales self-time per class; missing keys mean 1.0.
	Factors map[ClassKey]float64

	// CycleWarnings counts wait/join cycles broken during replay.
	CycleWarnings int
	// FallbackAnchors counts ops anchored without their parent's replay
	// reaching their spawn (parent in flight or inconsistent data). They
	// anchor in the parent's shifted frame at their recorded offset.
	FallbackAnchors int
	// FallbackAnchorOps holds a sample of fallback-anchored ops.
	FallbackAnchorOps []*Op
	// contains filtered or unexported fields
}

Simulation replays the compiled program under per-class self-time factors.

func NewSimulation

func NewSimulation(g *Graph, factors map[ClassKey]float64) *Simulation

NewSimulation prepares a replay over g with the given per-class self-time factors (nil means baseline).

func (*Simulation) ExplainFinish

func (s *Simulation) ExplainFinish(op *Op, maxDepth int) []*Op

ExplainFinish walks the constraint chain from op downward through the dependency (child/wait-target) with the latest simulated finish at each step. Used for debugging replay-model fidelity and as a simulated critical chain.

func (*Simulation) Run

func (s *Simulation) Run() (makespanNS int64, err error)

Run replays all roots and returns the simulated makespan: the latest root finish minus the earliest root start.
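The makespan definition above reduces to a single pass over the roots. A minimal sketch with a hypothetical `span` type; returning 0 for an empty input is an assumption of this example:

```go
package main

import "fmt"

// span is a hypothetical stand-in for a root op's simulated start and finish.
type span struct{ StartNS, FinishNS int64 }

// makespan returns the latest finish minus the earliest start across roots.
func makespan(roots []span) int64 {
	if len(roots) == 0 {
		return 0
	}
	earliest, latest := roots[0].StartNS, roots[0].FinishNS
	for _, r := range roots[1:] {
		if r.StartNS < earliest {
			earliest = r.StartNS
		}
		if r.FinishNS > latest {
			latest = r.FinishNS
		}
	}
	return latest - earliest
}

func main() {
	roots := []span{{10, 50}, {5, 30}, {40, 90}}
	fmt.Println(makespan(roots)) // prints 85
}
```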

func (*Simulation) SimTimes

func (s *Simulation) SimTimes(op *Op) (startNS, finishNS int64)

SimTimes returns the simulated start/finish for an op (zero values when the op was not reached by the replay).

type WaitEdge

type WaitEdge struct {
	Waiter      *Op // nil if the waiting code had no profiled op
	Target      *Op // nil for unresolved or resource waits
	TargetIdent string
	Reason      string
	StartNS     int64
	EndNS       int64
}

WaitEdge is one recorded blocked-on interval.

func (*WaitEdge) Duration

func (w *WaitEdge) Duration() int64

type WhatIfResult

type WhatIfResult struct {
	Key ClassKey
	// SavedNS[f] is baseline makespan minus the makespan with the class
	// scaled by factor f.
	SavedNS map[float64]int64
}

WhatIfResult is the simulated impact of scaling one class's self-time.

func RunWhatIfs

func RunWhatIfs(g *Graph, factors []float64, minSelfNS int64) (baselineNS int64, results []WhatIfResult, err error)

RunWhatIfs computes the baseline makespan and, for every class whose total self-time is >= minSelfNS (at most maxWhatIfClasses classes, ranked by total self-time), the makespan saved when that class's self-time is scaled by each factor. The simulations run in parallel.
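The shape of a what-if sweep can be shown with a deliberately simplified model. In the sketch below, every name is hypothetical, and the "replay" is a toy: each task finishes at the latest finish of its dependencies plus its scaled self-time, which is far cruder than the package's replay simulator. What it does share with the description above is the saving formula (baseline makespan minus scaled makespan) and running one simulation per class and factor concurrently.

```go
package main

import (
	"fmt"
	"sync"
)

// task is a hypothetical unit of work in a toy dependency model.
type task struct {
	Class  string
	SelfNS int64
	Deps   []int // indices of earlier tasks that must finish first
}

// makespan replays tasks (given in topological order) with per-class
// self-time factors; classes missing from factors use 1.0.
func makespan(tasks []task, factors map[string]float64) int64 {
	finish := make([]int64, len(tasks))
	var latest int64
	for i, t := range tasks {
		var ready int64
		for _, d := range t.Deps {
			if finish[d] > ready {
				ready = finish[d]
			}
		}
		f, ok := factors[t.Class]
		if !ok {
			f = 1.0
		}
		finish[i] = ready + int64(float64(t.SelfNS)*f)
		if finish[i] > latest {
			latest = finish[i]
		}
	}
	return latest
}

// whatIfs returns the baseline makespan and, per class and factor, the
// time saved relative to that baseline. Simulations run concurrently.
func whatIfs(tasks []task, classes []string, factors []float64) (int64, map[string]map[float64]int64) {
	baseline := makespan(tasks, nil)
	saved := make(map[string]map[float64]int64)
	var mu sync.Mutex
	var wg sync.WaitGroup
	for _, c := range classes {
		inner := make(map[float64]int64)
		saved[c] = inner
		for _, f := range factors {
			wg.Add(1)
			go func(c string, f float64) {
				defer wg.Done()
				ms := makespan(tasks, map[string]float64{c: f})
				mu.Lock() // guards the inner maps written by the goroutines
				inner[f] = baseline - ms
				mu.Unlock()
			}(c, f)
		}
	}
	wg.Wait()
	return baseline, saved
}

func main() {
	tasks := []task{
		{Class: "build", SelfNS: 100},
		{Class: "test", SelfNS: 50, Deps: []int{0}},
		{Class: "lint", SelfNS: 30},
	}
	baseline, saved := whatIfs(tasks, []string{"build", "test", "lint"}, []float64{0.5})
	fmt.Println(baseline, saved["build"][0.5], saved["test"][0.5], saved["lint"][0.5])
	// prints 150 50 25 0
}
```

Halving `lint` saves nothing because it is off the critical chain, which is the kind of result a what-if ranking is meant to surface.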
